Compositional and Spatial Data Analysis (COSDA-UPC)

The COSDA-UPC group develops and applies statistical techniques suited to spatial and compositional data, whose particular nature calls for specific methods. The group carries out theoretical and methodological work to improve statistical methods for the analysis of compositional and spatial multivariate data, is actively involved in the analysis of data sets from these fields, and develops freely available software for statistical analysis. COSDA-UPC participates in a coordinated research project with the Research Group in Compositional Data Analysis of the University of Girona (UdG), and takes part in the CoDa courses organized by that group and in the international CoDaWork meetings.

Currently, the groups work on statistical and compositional aspects in three research lines:

  • Biomarkers and genetic markers
  • Contamination, natural risks and climate change
  • Comparison and characterization of alimentation systems

Main projects

  • 1. Transferring compositional data methods into applied science and technology

    The TRANS-CODA project aims to extend and apply methodology from the field of compositional data analysis. We carry out theoretical work on the distribution of log-ratio coordinates and on the representation of dependence (contingency tables, copulas) within the compositional context, and we develop a compositional test for Hardy-Weinberg equilibrium. Most of our efforts are directed towards applying compositional statistical methodology in other fields of science. Our project has three main research lines: a) Biomarkers and genetic markers, b) Contamination, natural risks and climate change, c) Comparison and characterization of alimentation systems. The first research line is the most relevant for Bioinformatics. It concerns the analysis of large databases of biomarkers (the so-called "omics" data) and of genetic markers (single nucleotide polymorphisms (SNPs) and short tandem repeats (STRs)), using methods from multivariate analysis, statistical genetics and compositional data analysis.

  • 2. Statistical MEthods in resTRICted spaceS (METRICS)

    Classical statistical methods produce nonsensical results when applied to constrained sample spaces. Typical symptoms are predictive regions that fall outside the sample space and spurious correlations. Consequently, statistical methods for these sample spaces should take their particular nature into account. The METRICS project (period 2013-2015) comprised four main lines: advances in theoretical and methodological issues, development of descriptive and exploratory techniques, studies and applications in other fields, and training and dissemination activities.
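
The spurious correlation mentioned under METRICS, and the log-ratio approach mentioned under TRANS-CODA, can be illustrated with a minimal Python sketch. It is not taken from the group's software: the function names (closure, clr, pearson), the simulated lognormal data and the fixed seed are all assumptions made for illustration. Three independent positive variables are rescaled to proportions that sum to 1, which by itself induces negative correlation between the parts; the centered log-ratio (clr) transform is shown as one common way of moving such data to unconstrained coordinates.

```python
import math
import random


def closure(x):
    """Rescale positive parts so that they sum to 1 (a composition)."""
    total = sum(x)
    return [v / total for v in x]


def clr(x):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical data: three independent positive variables (illustrative only).
rng = random.Random(0)
raw = [[rng.lognormvariate(0.0, 0.5) for _ in range(3)] for _ in range(2000)]

# Closing each row to unit sum imposes the constraint discussed in the text.
comp = [closure(row) for row in raw]

# Correlation between the first two variables before and after closure.
r_raw = pearson([r[0] for r in raw], [r[1] for r in raw])
r_comp = pearson([c[0] for c in comp], [c[1] for c in comp])

print("correlation of raw (independent) variables:", round(r_raw, 3))
print("correlation of the same parts after closure:", round(r_comp, 3))

# clr coordinates depend only on ratios, so closing the data does not change them.
print("clr of raw row 0:   ", [round(v, 4) for v in clr(raw[0])])
print("clr of closed row 0:", [round(v, 4) for v in clr(comp[0])])
```

The raw variables are generated independently, so any clearly negative correlation between the closed parts is an artifact of the unit-sum constraint and not a property of the underlying quantities. The clr values of a row are the same before and after closure, up to floating-point rounding, because they are built from ratios of parts only.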



Bioinformatics expertise:

Group Leader:

Jan Graffelman

Bioinformatics services offered

  • Data analysis

    Preparation of statistical reports on the analysis of multivariate data sets, in particular compositional data and genetic marker data.