A mathematical framework for COMIC-Tree: an undirected graphical model for T-cell receptor specificity
T-cells are a core component of the adaptive immune system: they play a major role in mounting an effective and tailored response to foreign pathogens, and they are also relevant in the context of cancer and certain autoimmune diseases. T-cell receptors are protein complexes on the surface of T-cells that are responsible for recognizing foreign and self antigens. Given the complexity of protein-protein interactions, this recognition process exhibits a quasi-stochastic behaviour that can be described with probabilistic and statistical models. Graphical models represent a multivariate distribution as a graph, in a convenient and transparent way. In this paper we introduce COMIC-Tree, an undirected graphical model for protein-protein interactions, and DrawCOMIC-Tree, a greedy algorithm based on conditional mutual information for learning COMIC-Tree structures. We provide a mathematical foundation for both, highlight some of their theoretical aspects, and test them empirically on a dataset of T-cell receptors.