mRNA gene expression microarrays have proven to be an invaluable tool for elucidating the mechanisms of diverse biological processes at the molecular level. In a microarray experiment, several thousand genes are investigated in parallel. Mainly because of the high cost of microarrays, gene expression studies are normally carried out with a rather limited set of conditions and repetitions, using an experimental design that focuses on a few very specific research questions. Over time, however, the accumulation of microarray data sets adds a new dimension to gene expression data analysis: the investigation of a large set of genes across a large set of experimental conditions. Analyzing such data is not trivial and requires sophisticated data mining solutions.
In this talk, I first review the common analysis methods for gene expression data and outline their limitations. I then introduce the definition of association pattern discovery (APD) and summarize the main steps of the well-known Apriori algorithm. This is followed by a discussion of how APD methods can be applied to gene expression data. Such methods can find groups of co-regulated genes in which the genes are either up- or down-regulated throughout the identified conditions. They fail, however, to identify similarly expressed genes whose expression switches between up- and down-regulation from one condition to another. To discover these hidden patterns, we propose a new concept of mining co-regulated gene profiles. The high content of biologically relevant information in these patterns is demonstrated by the significant enrichment of co-regulated genes with similar functions. Our results show that the proposed method is an efficient tool for the analysis of gene expression data.
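As an illustration of how Apriori-style APD might be applied to expression data, the following is a minimal sketch, not the method proposed in the talk. It assumes each experimental condition is treated as one transaction and each gene is discretized into an "up" or "down" item using a hypothetical log-ratio threshold of 1.0; the gene names, values, and function names are invented for the example.

```python
from itertools import combinations

def discretize(expr, threshold=1.0):
    """Turn log-ratios into one transaction (set of items) per condition.

    The threshold is an assumption made for this sketch.
    """
    transactions = []
    for condition in expr:
        items = set()
        for gene, value in condition.items():
            if value >= threshold:
                items.add((gene, "up"))
            elif value <= -threshold:
                items.add((gene, "down"))
        transactions.append(frozenset(items))
    return transactions

def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    def count(candidates):
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    singletons = {frozenset([item]) for t in transactions for item in t}
    frequent = count(singletons)
    result = dict(frequent)
    k = 2
    while frequent:
        prev = list(frequent)
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        frequent = count(candidates)
        result.update(frequent)
        k += 1
    return result

# Toy data: four conditions, three hypothetical genes (log-ratios).
expr = [
    {"g1": 2.1, "g2": 1.8, "g3": -1.5},
    {"g1": 1.4, "g2": 1.2, "g3": 0.1},
    {"g1": 1.9, "g2": 1.1, "g3": -2.0},
    {"g1": -1.3, "g2": -1.6, "g3": 0.2},
]
result = apriori(discretize(expr), min_support=2)
for itemset, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

In this toy data, g1 and g2 move together in all four conditions, but the pattern "g1 up, g2 up" is supported by only three of them, because both genes switch to down-regulation in the fourth. This is the kind of direction-switching co-regulation that item-based patterns of this form do not capture.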