Data

The main ingredient in the process of data mining is the data. In OntoDM-core, we model data with a data specification entity that describes the datatype of the underlying data. For this purpose, we import the mechanism for representing arbitrarily complex datatypes from the OntoDT ontology.

Descriptive and output data specification

In OntoDM-core, we distinguish between a descriptive data specification, which specifies the data used for descriptive purposes (e.g., in clustering and pattern discovery), and an output data specification, which specifies the data used for output purposes (e.g., classes/targets in predictive modeling). A tuple of primitives and a graph with boolean edges and discrete nodes are examples of data specified only by a descriptive specification. Feature-based data with primitive output and feature-based data with structured output are examples of data specified by both a descriptive and an output specification.
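The distinction above can be illustrated with a minimal Python sketch. It is not part of OntoDM-core: the class names `DataSpecification` and `DataDescription`, the `has_output` helper, and the use of plain strings for datatypes are all assumptions made for illustration. An actual model would reference OntoDT datatype classes instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataSpecification:
    """Hypothetical stand-in for a data specification: it only names a datatype."""
    datatype: str  # assumption: a plain string instead of an OntoDT datatype class


@dataclass(frozen=True)
class DataDescription:
    """Illustrative pairing of a descriptive specification with an optional output one."""
    descriptive: DataSpecification
    output: Optional[DataSpecification] = None

    def has_output(self) -> bool:
        # True when the data is specified by both kinds of specification.
        return self.output is not None


# Data specified only by a descriptive specification (e.g., for clustering).
unlabeled = DataDescription(DataSpecification("tuple of primitives"))

# Feature-based data with primitive output (e.g., for predictive modeling).
labeled = DataDescription(
    descriptive=DataSpecification("tuple of primitives"),
    output=DataSpecification("discrete"),
)
```

Making the output specification optional mirrors the text: every kind of data has a descriptive specification, while only data intended for predictive tasks also carries an output specification.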

Dataset

OntoDM-core imports the IAO class dataset (defined as 'a data item that is an aggregate of other data items of the same type that have something in common') and extends it by further specifying that a DM dataset has data examples as parts.

OntoDM-core also defines the class dataset specification to enable reasoning about data and datasets. A dataset specification specifies the type of the dataset based on the type of data it contains. Using data specifications and the taxonomy of datatypes from the OntoDT ontology, OntoDM-core builds a taxonomy of datasets.
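As a rough sketch of how a dataset's type can follow from the type of the data it contains, consider the Python fragment below. The names `DataExample`, `Dataset`, and `specification`, and the string-based datatypes, are hypothetical and chosen only for illustration. OntoDM-core expresses this with ontology classes and relations, not with code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataExample:
    """Hypothetical data example: its values plus the name of its datatype."""
    values: Tuple[object, ...]
    datatype: str  # assumption: a plain string instead of an OntoDT datatype class


@dataclass
class Dataset:
    """A dataset has data examples as parts."""
    examples: List[DataExample]

    def specification(self) -> str:
        # Derive the dataset type from the single datatype shared by its examples.
        datatypes = {example.datatype for example in self.examples}
        if len(datatypes) != 1:
            raise ValueError("expected exactly one datatype among the data examples")
        return f"dataset of {datatypes.pop()}"


dataset = Dataset([
    DataExample((1.0, True), "tuple of primitives"),
    DataExample((2.5, False), "tuple of primitives"),
])
print(dataset.specification())
```

Rejecting mixed datatypes reflects the imported IAO definition, under which a dataset aggregates data items of the same type.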

