## Data mining algorithm

A *data mining algorithm* is an algorithm, implemented as a computer program, that is designed to solve a data mining task. It takes as input a dataset of examples of a given datatype and produces as output a generalization (from a given class) on that datatype. A specific data mining algorithm can typically handle examples of only a limited set of datatypes: for example, a rule learning algorithm might handle only tuples of Boolean attributes and a Boolean class.

In the OntoDM-core ontological framework, we consider three aspects of the DM algorithm entity: a DM algorithm (as a specification), a DM algorithm implementation, and a DM algorithm execution.

### Data mining algorithm as a specification

*Data mining algorithm* as a specification is a subclass of the IAO class *plan specification* having as parts a *data mining task*, an *action specification* (reused from IAO), a *generalization specification*, and a *document* (reused from IAO). The *data mining task* defines the objective that the realized plan should fulfill, namely producing a generalization as output, while the *action specification* describes the actions of the data mining algorithm that are realized in the process of execution. The *generalization specification* denotes the type of generalization produced by executing the algorithm. Finally, having a *document* class as a part allows us to connect the algorithm to the annotations of documents (journal articles, workshop articles, technical reports) that publish knowledge about the algorithm.

In analogy with the taxonomy of datasets, data mining tasks and generalizations, in OntoDM-core we also construct a taxonomy of data mining algorithms. As criteria, we use the data mining task and the generalization produced as the output of the execution of the algorithm.
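The parts of the specification and the two taxonomy criteria can be pictured as plain data structures. The following is a minimal sketch, not the OntoDM-core encoding itself (which is an ontology, not program code): all Python class and field names are assumptions chosen to mirror the terms above, and the instance values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class DataMiningTask:
    name: str


@dataclass(frozen=True)
class ActionSpecification:
    description: str


@dataclass(frozen=True)
class GeneralizationSpecification:
    generalization_type: str


@dataclass(frozen=True)
class Document:
    title: str


@dataclass
class DataMiningAlgorithm:
    """Algorithm as a specification: a plan with the four kinds of parts."""
    name: str
    task: DataMiningTask
    action_specification: ActionSpecification
    generalization_specification: GeneralizationSpecification
    documents: List[Document] = field(default_factory=list)

    def taxonomy_key(self) -> Tuple[str, str]:
        # The two criteria named in the text: the task and the
        # type of generalization produced as output.
        return (self.task.name,
                self.generalization_specification.generalization_type)


# Placeholder instance (hypothetical values, not taken from OntoDM-core).
example = DataMiningAlgorithm(
    name="example rule learner",
    task=DataMiningTask("example predictive task"),
    action_specification=ActionSpecification("placeholder description of steps"),
    generalization_specification=GeneralizationSpecification("rule set"),
    documents=[Document("placeholder technical report")],
)
```

Algorithms that share a `taxonomy_key()` would fall under the same node of such a taxonomy in this simplified picture.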

### Data mining algorithm implementation

*Data mining algorithm implementation* is defined as a subclass of the BFO class *realizable entity*. It is a concretization of a *data mining algorithm*, in the form of a runnable computer program, and has *parameters* as qualities. The parameters of the algorithm affect its behavior when the algorithm implementation is used as an operator. Each parameter is specified by a *parameter specification* that includes its name and description.
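As a rough illustration of how an implementation relates to its parameters and their specifications, consider the sketch below. The class names, the `declare_parameter` helper, and the parameter `min_support` are hypothetical and serve only to mirror the terms in the paragraph above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class ParameterSpecification:
    # A parameter specification includes the parameter's name and description.
    name: str
    description: str


@dataclass
class DataMiningAlgorithmImplementation:
    # Name of the algorithm (specification) that this implementation concretizes.
    algorithm_name: str
    # Parameters, keyed by name, each described by its specification.
    parameters: Dict[str, ParameterSpecification] = field(default_factory=dict)

    def declare_parameter(self, name: str, description: str) -> None:
        self.parameters[name] = ParameterSpecification(name, description)


# Hypothetical implementation with one hypothetical parameter.
impl = DataMiningAlgorithmImplementation("example rule learner")
impl.declare_parameter("min_support", "placeholder description of the parameter")
```

Note that the implementation only declares which parameters exist; concrete values are attached later, when the implementation is used as an operator.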

### Data mining software

In OntoDM-core, we define *data mining software* as a subclass of *directive information entity* (reused from IAO). It represents a specification of a *data mining algorithm implementation*. It has as parts all the meta-information entities about the software implementation, such as *source code*, *software version specification*, *programming language*, *software compiler specification*, *software manufacturer*, and the *data mining software toolkit* it belongs to. Finally, a *data mining software toolkit* is a specification entity that contains as parts *data mining software* entities.
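The part-of structure described here can be sketched as a record of meta-information plus a toolkit that groups such records. This is only an illustrative sketch: the field names are assumptions derived from the terms above, and every value shown is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataMiningSoftware:
    # The algorithm implementation that this software entity specifies.
    implementation_name: str
    # Meta-information parts; all optional in this sketch.
    source_code: Optional[str] = None
    version: Optional[str] = None
    programming_language: Optional[str] = None
    compiler: Optional[str] = None
    manufacturer: Optional[str] = None
    toolkit_name: Optional[str] = None


@dataclass
class DataMiningSoftwareToolkit:
    name: str
    software: List[DataMiningSoftware] = field(default_factory=list)

    def add(self, item: DataMiningSoftware) -> None:
        # Record the membership on both sides: the toolkit has the software
        # as a part, and the software names the toolkit it belongs to.
        item.toolkit_name = self.name
        self.software.append(item)


# Placeholder values only; no real toolkit or version is implied.
toolkit = DataMiningSoftwareToolkit("example toolkit")
toolkit.add(DataMiningSoftware("example rule learner implementation",
                               version="0.0-placeholder",
                               programming_language="example language"))
```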

### Data mining operator

*Data mining operator* is defined as a subclass of the BFO class *role*. In that context, it is a role of a *data mining algorithm implementation* that is realized (executed) by a *data mining algorithm execution* process.
A *data mining operator* carries information about the specific *parameter setting* of the algorithm, in the context of the realization of the operator in the process of execution. The *parameter setting* is a subclass of *data item* (reused from IAO) and is a quality specification of a *parameter*.
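One way to picture the operator role is as an implementation paired with concrete values for its declared parameters. The sketch below assumes hypothetical names (`as_operator`, `ParameterSetting`, the parameter `min_support`) and adds a validation step that is a design choice of this example, not something stated by the ontology.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass(frozen=True)
class ParameterSetting:
    # A data item that fixes the value of one parameter.
    parameter_name: str
    value: Any


@dataclass
class DataMiningOperator:
    # The implementation that plays the operator role.
    implementation_name: str
    settings: Dict[str, ParameterSetting]


def as_operator(implementation_name: str,
                declared_parameters: Iterable[str],
                values: Dict[str, Any]) -> DataMiningOperator:
    """Pair an implementation with parameter settings for one execution."""
    unknown = set(values) - set(declared_parameters)
    if unknown:
        # Assumption of this sketch: settings may only refer to declared parameters.
        raise ValueError(f"undeclared parameters: {sorted(unknown)}")
    settings = {name: ParameterSetting(name, value)
                for name, value in values.items()}
    return DataMiningOperator(implementation_name, settings)


operator = as_operator("example implementation",
                       declared_parameters=["min_support"],
                       values={"min_support": 0.1})
```

The same implementation can thus give rise to different operators, one per parameter setting.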

### Data mining algorithm execution

In OntoDM-core, we define *data mining algorithm execution* as a subclass of *planned process* (reused from the OBI ontology). A *data mining algorithm execution* realizes (executes) a *data mining operator*, has as input a *dataset*, has as output a *generalization*, has as agent a *computer*, and achieves as a planned objective a *data mining task*.
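The relations listed above (realizes, has input, has output, has agent, achieves objective) can be illustrated with a small runnable sketch. Everything here is an assumption made for illustration: the class names, the string used for the agent and task, and in particular the toy `majority_class` function, which merely stands in for a real data mining algorithm implementation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Each example: a tuple of Boolean attributes plus a Boolean class.
Dataset = List[Tuple[Tuple[bool, ...], bool]]


@dataclass
class Operator:
    run: Callable[[Dataset, Dict[str, Any]], Any]
    parameter_setting: Dict[str, Any]


@dataclass
class AlgorithmExecution:
    operator: Operator        # realizes (executes)
    dataset: Dataset          # has as input
    agent: str                # has as agent (a computer)
    task: str                 # achieves as planned objective
    generalization: Any = None  # has as output, filled in by execute()

    def execute(self) -> Any:
        self.generalization = self.operator.run(
            self.dataset, self.operator.parameter_setting)
        return self.generalization


def majority_class(dataset: Dataset, params: Dict[str, Any]) -> bool:
    # Toy stand-in: the "generalization" is just the most frequent class.
    counts = Counter(label for _, label in dataset)
    return counts.most_common(1)[0][0]


data: Dataset = [((True, False), True),
                 ((False, False), True),
                 ((True, True), False)]

execution = AlgorithmExecution(
    operator=Operator(run=majority_class, parameter_setting={}),
    dataset=data,
    agent="example computer",
    task="example predictive task",
)
result = execution.execute()
```

In this sketch the execution object records both its input and its output, so the provenance of a generalization can be traced back to the dataset, operator, and agent that produced it.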