EasyMiner easy association rule mining, classification and anomaly detection

For Developers

Use EasyMiner API in your project

EasyMiner is an academic data mining project that provides association rule mining, building of classification models based on association rules, and outlier detection based on frequent pattern mining. The project is composed of components and services with fully documented REST APIs. Most of the components and services are available under the open source Apache License, Version 2.0.

To use EasyMiner functionality in your own project, you can build, for example, a mashup application (or write your own data mining script) using the main REST API. This API provides the full functionality of EasyMiner, including functions that are not yet available in the GUI.

You can also extend EasyMiner by adding new algorithms or services: rule mining, outlier detection or a scorer service. For this purpose, the integration component EasyMinerCenter provides documented interfaces in PHP.

REST API

If you want to use EasyMiner features in your own project, the easiest way is to use the main API (provided by the EasyMinerCenter component). The API is available at the URL <server>/easyminercenter/api, where <server> is the URL of the EasyMiner server.

For more information, refer to the API tutorial.
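
As a rough illustration of calling the main API from a script, the following Python sketch prepares (but does not send) a GET request under <server>/easyminercenter/api. The server URL, the resource name "some-resource" and the "page" query parameter are hypothetical placeholders, not documented parts of the API; the real resource names and authentication details have to be taken from the API documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request


def main_api_url(server: str, resource: str = "", **query: str) -> str:
    """Build a URL under <server>/easyminercenter/api.

    `resource` and `query` are hypothetical placeholders; the real
    resource names come from the API documentation.
    """
    base = server.rstrip("/") + "/easyminercenter/api"
    url = base + ("/" + resource.strip("/") if resource else "")
    return url + ("?" + urlencode(query) if query else "")


def build_request(server: str, resource: str = "", **query: str) -> Request:
    """Prepare (but do not send) a GET request that asks for JSON."""
    return Request(
        main_api_url(server, resource, **query),
        headers={"Accept": "application/json"},
    )


# Assumed server URL and resource name, for illustration only.
req = build_request("https://example.org", "some-resource", page="1")
print(req.get_method(), req.full_url)
```

The prepared request can be passed to urllib.request.urlopen once the resource name and any required credentials have been filled in.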

EasyMiner architecture

EasyMiner is composed of reusable services with fully documented APIs. The main services in the current version of EasyMiner are shown in the following figure:

The central component (service) is EasyMinerCenter. This component manages user accounts and tasks, stores discovered association rules, and provides an authentication service for the other services. It also calls the other services and provides the main graphical web user interface as well as the main API for integrating EasyMiner functionality into other projects and scripted data mining workflows.

  • Graphical UI: <server>/easyminercenter
  • API endpoint: <server>/easyminercenter/api
  • API documentation: <server>/easyminercenter/api

EasyMiner-Data is a web service for management of data sources. This service supports uploading data in the CSV and RDF formats. Uploaded data are stored in a database (data backend): MySQL (MariaDB) or Hive.

  • API endpoint: <server>/easyminer-data/api/v1
  • API documentation: <server>/easyminer-data/index.html

The EasyMiner-Preprocessing service supports creation of datasets for data mining. It takes data fields from a data source created using EasyMiner-Data and creates attributes from these data fields using one of the following preprocessing algorithms: each value-one bin, intervals enumeration, nominal enumeration, equidistant intervals, equisized intervals.

  • API endpoint: <server>/easyminer-preprocessing/api/v1
  • API documentation: <server>/easyminer-preprocessing/index.html
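
To make the difference between the last two preprocessing algorithms more concrete, here is a minimal Python sketch of the general ideas: equidistant intervals have equal width, while equisized intervals hold roughly equal numbers of values. It illustrates the generic binning techniques only; the function names are made up for this example, and the service's own implementation and parameters may differ.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def equidistant_intervals(values: List[float], bins: int) -> List[Interval]:
    """Split the range [min, max] into `bins` intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [(lo + i * width, lo + (i + 1) * width) for i in range(bins)]


def equisized_intervals(values: List[float], bins: int) -> List[Interval]:
    """Split the sorted values into `bins` groups of roughly equal size.

    Each interval is reported as (smallest value, largest value) of its group.
    """
    ordered = sorted(values)
    n = len(ordered)
    cuts = [round(i * n / bins) for i in range(bins + 1)]
    return [
        (ordered[cuts[i]], ordered[cuts[i + 1] - 1])
        for i in range(bins)
        if cuts[i] < cuts[i + 1]  # skip groups that would be empty
    ]


data = [1, 2, 3, 4, 10, 20, 30, 40]
print(equidistant_intervals(data, 2))
print(equisized_intervals(data, 2))
```

On skewed data such as the sample list, the equal-width split puts most values into the first interval, whereas the equal-count split places four values in each.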

EasyMiner-Miner is a web service that encapsulates data mining algorithms. Supported data mining methods include association rule learning (Apriori, FP-Growth), pruning and classification (CBA) and outlier detection (algorithms included in the fpmoutlier package).

  • API endpoint: <server>/easyminer-miner/api/v1
  • API documentation: <server>/easyminer-miner/index.html
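
For readers new to association rule learning, the following Python sketch shows the two standard interest measures of a rule, support and confidence, computed naively over a toy list of transactions. It is not the code of the miner service and does not implement Apriori or FP-Growth; the function names and the sample data are invented for illustration.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]


def support(transactions: List[Itemset], itemset: Itemset) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: List[Itemset],
               antecedent: Itemset, consequent: Itemset) -> float:
    """Support of (antecedent and consequent) divided by support of antecedent."""
    return (support(transactions, antecedent | consequent)
            / support(transactions, antecedent))


# Toy data, invented for this example.
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
rule_support = support(baskets, frozenset({"bread", "milk"}))
rule_confidence = confidence(baskets, frozenset({"bread"}), frozenset({"milk"}))
print(rule_support, rule_confidence)
```

Frequent pattern algorithms such as Apriori and FP-Growth exist to find all itemsets above a support threshold without enumerating every candidate the way this naive version would have to.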

EasyMiner-Scorer is a web service for testing of classification models based on association rules.

  • API endpoint: <server>/easyminer-scorer/v0.3
  • API documentation: <server>/easyminer-scorer/index.html

In all the URLs above, replace the <server> part with the URL of your EasyMiner server.
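
The substitution of <server> can be sketched in Python as follows. The endpoint paths are the ones listed above; the dictionary keys and the example server URL are arbitrary labels chosen for this sketch.

```python
from typing import Dict

# Endpoint paths as listed above; the keys are labels chosen for this sketch.
SERVICE_PATHS: Dict[str, str] = {
    "center": "/easyminercenter/api",
    "data": "/easyminer-data/api/v1",
    "preprocessing": "/easyminer-preprocessing/api/v1",
    "miner": "/easyminer-miner/api/v1",
    "scorer": "/easyminer-scorer/v0.3",
}


def service_endpoints(server: str) -> Dict[str, str]:
    """Return the API endpoint URL of each service for the given server."""
    base = server.rstrip("/")  # tolerate a trailing slash in the server URL
    return {name: base + path for name, path in SERVICE_PATHS.items()}


# "https://example.org" stands in for the URL of a real EasyMiner server.
endpoints = service_endpoints("https://example.org/")
for name, url in endpoints.items():
    print(f"{name}: {url}")
```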

Extend EasyMiner functionality

The functionality of EasyMiner is continuously extended with new algorithms and data mining platforms. The two most recent additions are outlier detection and Spark mining.

EasyMinerCenter supports several data sources (databases or data services), preprocessing services, mining algorithms and scorers. All these services are integrated using drivers written in PHP. The current version of this component supports three types of data mining backends - see the schema.

To integrate your own algorithm, you can use the following PHP interfaces:

EasyMinerCenter\Model\Data\Databases\IDatabase
EasyMinerCenter\Model\Preprocessing\Databases\IPreprocessing
EasyMinerCenter\Model\Mining\IMiningDriver
EasyMinerCenter\Model\Mining\IOutliersMiningDriver
EasyMinerCenter\Model\Scoring\IScorerDriver

A new driver has to be registered in the application configuration, which is located in the directory <server>/easyminercenter/app/config. The main configuration file is config.neon; configuration specific to the given server should be defined in the file config.local.neon. The algorithm drivers are configured in the "parameters" section. For an example, refer to the configuration of the existing drivers.

Issue tracking & source code

Source code of most components of EasyMiner/R is available in public repositories on GitHub.com.

The main repository is KIZI/EasyMiner

The rule mining and classification functionality of the code base is regularly checked with integration tests in Travis CI.

The main repository includes the main components of the EasyMiner system as well as subprojects. To clone the main repository, run the command:

git clone --recursive https://github.com/KIZI/EasyMiner.git

Structure of repositories

Frontend service

Backend services

Other source codes

Issue tracking

Issues are tracked separately for each of the GitHub projects listed in the previous paragraphs. If you find an error or have a suggestion for improving EasyMiner, you can add a new issue directly to the relevant GitHub project. If you are not sure which GitHub project to choose, add the issue to the main repository; we will process it and move it to the right subproject.