Knowledge Discovery and Data Engineering Research Group


EXPRES

EXplainer and PRedictor of tEmporal States

 


A short description

EXPRES is a temporal data mining tool that analyzes longitudinal data (i.e., multivariate time series): observations over time of the attributes that describe a dynamic domain (e.g., a process or phenomenon). Its functionalities are:

  • determining the states through which the process evolves;
  • discovering sequence(s) of events which trigger the transition from a state to the next one;
  • mining sequential patterns from sequences of triggering events mined for a set of pairs of states;
  • predicting the state that follows the last known state.

In formal terms, first

  • given P = {a_1, a_2, ..., a_m}, a finite set of attributes, and MP = {MP_1, MP_2, ..., MP_n}, a collection of time-stamped measurements of P (see Figure below),
  • determine a finite sequence S = {S_1, S_2, ..., S_s} of temporal states, which represent distinct subsequences of MP (see Figure below),
  • discover a sequence of events {e_1, e_2, ...} which may trigger the transition from S_j into S_{j+1}, j = 1, ..., s - 1, where S_j and S_{j+1} are known;

then

  • given P = {a_1, a_2, ..., a_m}, a finite set of attributes; a set M of collections MP^h = {MP^h_1, MP^h_2, ..., MP^h_n} of time-stamped measurements of P, each measured for a scenario h; a set of pairs of states (S^h_j, S^h_{j+1}), each determined for the scenario h; and a set DE of sequences of events {e^h_1, e^h_2, ...}, each determined for the h-th pair (S^h_j, S^h_{j+1}),
  • mine a set of sequential patterns whose support exceeds a user-defined threshold.
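The support threshold in the last step can be illustrated with a small sketch. It assumes that each scenario contributes one sequence of symbolic event labels and that a pattern is supported by a scenario when it occurs there in order, with gaps allowed. The function names and the brute-force candidate enumeration are illustrative assumptions, not the mining procedure used by EXPRES.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if pattern occurs in sequence in the same order (gaps allowed)."""
    remaining = iter(sequence)
    # "item in remaining" consumes the iterator, so order is enforced.
    return all(item in remaining for item in pattern)


def support(pattern, sequences):
    """Fraction of sequences (one per scenario) that contain the pattern."""
    if not sequences:
        return 0.0
    return sum(is_subsequence(pattern, s) for s in sequences) / len(sequences)


def frequent_patterns(sequences, min_support, max_length=3):
    """Brute-force enumeration of ordered patterns whose support exceeds min_support."""
    candidates = set()
    for seq in sequences:
        for k in range(1, max_length + 1):
            # combinations() preserves the order of the input sequence.
            candidates.update(combinations(seq, k))
    scored = {p: support(p, sequences) for p in candidates}
    return {p: s for p, s in scored.items() if s > min_support}


# Hypothetical event labels for three scenarios.
scenarios = [["e1", "e2", "e3"], ["e1", "e3"], ["e2", "e3"]]
patterns = frequent_patterns(scenarios, min_support=0.5, max_length=2)
```

With these three hypothetical scenarios, the pattern ("e1", "e3") is supported by two scenarios out of three and passes a threshold of 0.5, while ("e1", "e2") occurs in only one scenario and is discarded.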

[Figure: Stateevent.jpg]

 

  • A state S_j is defined in terms of

    • [ts_j, te_j], the time period of occurrence of the state (t_1 < ts_j < te_j < t_n),
    • C_j = {f_1, f_2, ...}, a finite set of fluents, namely facts in terms of the parameters P that are true during the time period [ts_j, te_j],
    • {sv_1, ..., sv_h, ..., sv_m}, a set of m symbolic values such that sv_h is a high-level description of the parameter a_h during the time period [ts_j, te_j].

  • An event e is defined in terms of

    • [t_F, t_L], the time period of occurrence of the event (t_1 < t_F < t_L < t_n),
    • E_a = {ea_1, ..., ea_k, ..., ea_m'} ⊆ P, a finite set of fluents, namely facts in terms of the parameters P that are true during the time period [ts_j, te_j],
    • {sv_1, ..., sv_h, ..., sv_m}, a set of m symbolic values such that sv_h is a high-level description of the parameter a_h during the time period [ts_j, te_j].
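The two definitions above share the same shape: a time period, a set of fluents and one symbolic value per parameter. A minimal sketch of that shape as a data structure follows. The class and field names are hypothetical, and fluents are represented as plain strings purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass
class TemporalUnit:
    """Shared shape of a state or an event (illustrative, not the EXPRES classes)."""
    start: float                                   # ts_j for a state, t_F for an event
    end: float                                     # te_j for a state, t_L for an event
    fluents: FrozenSet[str] = frozenset()          # facts that hold during the period
    symbolic_values: Dict[str, str] = field(default_factory=dict)  # parameter -> value

    def __post_init__(self):
        # A period must have positive duration.
        if not self.start < self.end:
            raise ValueError("the period must satisfy start < end")

    def within(self, t_first, t_last):
        """Check t_1 < start < end < t_n for an observation window [t_1, t_n]."""
        return t_first < self.start < self.end < t_last

    def covers(self, t):
        """Return True if time point t falls inside the period."""
        return self.start <= t <= self.end


class State(TemporalUnit):
    """A temporal state S_j."""


class Event(TemporalUnit):
    """An event e."""


# Hypothetical values, chosen only to exercise the structure.
state = State(10.0, 20.0, frozenset({"f1"}), {"a1": "high"})
```

Keeping the period constraint in `within` rather than in the constructor lets the same object be checked against different observation windows.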


The functional architecture

[Figure: EXPRES architecture]

The functional architecture of EXPRES consists of the following components:

SEGMENTATION - DETERMINATION OF TEMPORAL STATES - ATRE. The measurements MP are split through a multivariate time-series segmentation which applies first a top-down strategy and then a bottom-up one. For each segment, a characterization is produced by the ILP system ATRE, which returns the element C_j.
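The combination of a top-down pass followed by a bottom-up pass can be sketched as below. The page does not state the splitting or merging criterion, so this sketch assumes one: the sum of squared deviations from the per-attribute mean of a segment. The function names, the `max_error` and `min_len` parameters and the greedy merge order are likewise assumptions, and the ATRE characterization of each segment is not shown.

```python
def sse(rows):
    """Sum of squared deviations from the per-attribute mean over a block of rows."""
    if not rows:
        return 0.0
    total = 0.0
    for a in range(len(rows[0])):
        column = [row[a] for row in rows]
        mean = sum(column) / len(column)
        total += sum((v - mean) ** 2 for v in column)
    return total


def top_down(rows, lo, hi, max_error, min_len=2):
    """Recursively split rows[lo:hi] at the best cut until each piece fits max_error."""
    if hi - lo < 2 * min_len or sse(rows[lo:hi]) <= max_error:
        return [(lo, hi)]
    cut = min(range(lo + min_len, hi - min_len + 1),
              key=lambda c: sse(rows[lo:c]) + sse(rows[c:hi]))
    return (top_down(rows, lo, cut, max_error, min_len)
            + top_down(rows, cut, hi, max_error, min_len))


def bottom_up(rows, segments, max_error):
    """Greedily merge the cheapest adjacent pair while the result stays within max_error."""
    segments = list(segments)
    while len(segments) > 1:
        costs = [sse(rows[segments[i][0]:segments[i + 1][1]])
                 for i in range(len(segments) - 1)]
        i = min(range(len(costs)), key=costs.__getitem__)
        if costs[i] > max_error:
            break
        segments[i:i + 2] = [(segments[i][0], segments[i + 1][1])]
    return segments


def segment(rows, max_error, min_len=2):
    """Top-down split followed by a bottom-up merge; returns half-open index ranges."""
    return bottom_up(rows, top_down(rows, 0, len(rows), max_error, min_len), max_error)


# Two hypothetical attributes measured at twelve time points, with three regimes.
rows = [[0.0, 10.0]] * 4 + [[5.0, 20.0]] * 4 + [[0.0, 10.0]] * 4
segments = segment(rows, max_error=0.5)
```

The bottom-up pass repairs over-segmentation left by the recursive splitting: adjacent pieces are merged again whenever the merged segment still fits the error bound.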

TRIGGERING EVENTS DISCOVERER - ATRE. For each pair of temporal states, this step first generates candidate events through a Change Mining technique and the ATRE system, then selects the sequence of events that rank highest on statistical grounds.

EXPLANATORY PATTERN MINER - SPADA. This step mines sequential patterns from the sequences of events discovered in a finite set of scenarios. It exploits the SPADA relational pattern mining system.

TIME SERIES FORECASTING ENSEMBLE - TEMPORAL STATE FORECASTING. This step predicts the state that may follow the last determined state. It exploits an ensemble of neural networks to predict the measurements following MP_n, and the ATRE system to induce a characterization of these measurements in the form of a state.
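How an ensemble turns several member forecasts into one predicted measurement can be sketched as follows. The page does not say how the outputs of the neural networks are combined, so plain averaging is an assumption here, and the three simple forecasters below only stand in for trained networks. All names are hypothetical.

```python
from statistics import mean


def last_value(history):
    """Stand-in member: repeat the most recent measurement."""
    return history[-1]


def window_mean(history, width=3):
    """Stand-in member: average of the last `width` measurements."""
    return mean(history[-width:])


def linear_drift(history):
    """Stand-in member: extend the average step between first and last measurement."""
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[0]) / (len(history) - 1)


def ensemble_forecast(history, members):
    """Combine the member forecasts by simple averaging (an assumed combination rule)."""
    return mean(member(history) for member in members)


def forecast_measurement(series_by_attribute, members):
    """Forecast the next value of each attribute independently."""
    return {attribute: ensemble_forecast(history, members)
            for attribute, history in series_by_attribute.items()}


members = [last_value, window_mean, linear_drift]
# Hypothetical history of a single attribute a1.
next_measurement = forecast_measurement({"a1": [1.0, 2.0, 3.0, 4.0]}, members)
```

In the architecture described above, the forecast measurements would then be passed to ATRE to obtain their characterization as a state; that step is outside this sketch.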


Application: Sleep Disorders

In this scenario, the goal is to acquire knowledge about the relationships among sleep, disordered breathing and cardiovascular disease: for instance, temporal information about the cardiovascular and breathing disorders (events) which may determine the change from one physiological stage (state) to another during sleep. The dataset is the polysomnography of a single patient observed from 9:30 p.m. to 6:30 a.m. (Sleep Heart Health Study).

Readme, Input data & Results.


Application: Air Pollution

In this scenario, the goal is to acquire knowledge about the relationships among pollutant emissions, meteorological conditions and hospitalization: for instance, temporal information about events, in the form of pollutant emissions and meteorological conditions, which may determine critical human health conditions (states). The dataset concerns thirteen US cities and comes from a study on mortality, air pollution and meteorological data in the period 1987-2000 (NMMAPS).

Readme, Input data & Results.


The distribution package

EXPRES is provided as a Java application and runs on a JVM 1.5 or higher (EXPRES download, UserGuide). Although the datasets are available in CSV format, the application is backed by an Oracle 10g DBMS and makes use of the ATRE and SPADA systems. For further details, please contact the project team.

Warning: EXPRES is free for evaluation, research and teaching purposes, but not for commercial purposes.

Please Acknowledge


Project team

Project Leader: Prof. Donato Malerba

LACAM Staff: Corrado Loglisci

Students involved in the project: Davide Coratza


Related publications

(in reverse chronological order)

  • LOGLISCI C, MALERBA D. (2009). A Temporal Data Mining Approach for Discovering Knowledge on the Changes of the Patient's Physiology. In: Proceedings of the 12th Conference on Artificial Intelligence in Medicine, AIME 2009, Verona, Italy, July 18-22, 2009. (pp. 26-35). ISBN/ISSN: 978-3-642-02975-2. doi:10.1007/978-3-642-02976-9. Springer-Verlag (GERMANY).
  • LOGLISCI C, MALERBA D. (2008). Discovering Triggering Events from Longitudinal Data. In: Proceedings of the 2008 IEEE International Conference on Data Mining Workshops, December 15-19, 2008. (pp. 248-256). ISBN/ISSN: 978-0-7695-3503-6. doi:10.1109/ICDMW.2008.136. LOS ALAMITOS, CA: IEEE Computer Society (UNITED STATES).
  • LOGLISCI C, MALERBA D. (2008). Mining Temporal Associations Between Air Pollution and Effects on the Human Health. In: BRITO P. Proceedings of COMPSTAT'2008 - Contributed Papers. (pp. 307-314). ISBN: 978-3-7908-2083-6. doi:10.1007/978-3-7908-2084-3. HEIDELBERG: Physica-Verlag (GERMANY).
  • LOGLISCI C, MALERBA D. (2008). Discovering Explanations from Longitudinal Data. In: AN A., MATWIN S., RAS Z.W., SLEZAK D. Foundations of Intelligent Systems, ISMIS 2008. (vol. 4994, pp. 196-202). ISBN: 978-3-540-68122-9. doi:10.1007/978-3-540-68123-6. BERLIN: Springer-Verlag (GERMANY).
  • LOGLISCI C, BERARDI M. (2006). Segmentation of Evolving Complex Data and Generation of Models. In: Proceedings of the Sixth IEEE International Conference on Data Mining - Workshops (ICDMW 2006), December 18, 2006, Hong Kong. (pp. 269-273). IEEE Computer Society.


corrado.loglisci@uniba.it


Last Update: Tue Apr 10 2007 14:21:36 GMT+0200 (W. Europe Daylight Time)

