Enterprise Data. Analytical Methods. Visual Analysis.
Applying Quantitative Tools for Business Intelligence
Data mining refers to a collection of techniques, tools, and processes that draw on information theory, statistics, and mathematics to discover actionable patterns in large quantities of data. It transforms voluminous data into business intelligence that carries more information than simple aggregates and that can be used both to describe the data and to make informed predictions from it. Data mining is employed as a tool for research, process improvement, quantitative marketing, customer relationship management, and homeland security. Several fields contribute to data mining, including statistics, decision sciences, and computer science.
The art and science of data mining comprises at least six (6) tasks:
Data mining is an integral part of the enterprise data-warehousing and business-intelligence infrastructure. It becomes a requisite discipline when analysis must predict future behavior from historical data, especially when it must discriminate among segments and subgroups. The data-mining team must extract, transform, and load (ETL) the data-warehouse data and prepare it for analysis before applying business rules to build and test a scoring model. Once the model is tested and validated against a representative subset of the data population, it is deployed and applied to identify items of interest in the entire data-warehouse population.
Data mining employs several different algorithms. Techniques such as automatic cluster detection, decision trees, and neural networks are tools in every data miner's knapsack. Decision trees, for example, are extremely versatile. There are two (2) types:
A process called recursive partitioning iteratively splits the data into separate, non-overlapping partitions. Once a partition is stable, it is assigned a node and a label. A pruning process then removes nodes or entire branches of nodes to improve the performance of the overall decision tree, typically by reducing overfitting.
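To make recursive partitioning concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a production method: the Gini impurity criterion, the `max_depth` limit (a simple form of pre-pruning, not the post-pruning pass described above), and the toy customer data are all choices made for this example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the partition is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must produce two non-empty partitions
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively partition the data; stop when no split helps or depth runs out."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        # No useful split remains: emit a leaf labeled with the majority class.
        return {"label": Counter(labels).most_common(1)[0][0]}
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return that leaf's label."""
    while "label" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["label"]

# Hypothetical toy data: (age, annual spend) -> responded to an offer?
rows = [(25, 100), (30, 150), (45, 800), (50, 900), (35, 120), (55, 700)]
labels = ["no", "no", "yes", "yes", "no", "yes"]

tree = build_tree(rows, labels)
young_customer = predict(tree, (28, 130))
older_customer = predict(tree, (60, 1000))
```

On this toy data a single split separates the two classes, so both child partitions are pure and become leaves immediately. Real data rarely splits this cleanly, which is where a separate pruning pass earns its keep.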
Good data mining starts with good data. CentraLytics applies best practices in data engineering, database administration, data warehousing & business intelligence, and ETL to craft the best analytic data set for your organization. The data granularity must be studied and optimized, derived variables must be defined, outliers must be identified and characterized, and data aggregations must be stipulated. Special care must be taken with missing or incorrect data.
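The preparation steps named above can be sketched in a few lines of Python. This is only an illustration: the customer records, the choice of median imputation for missing values, the 1.5 × IQR fence for outliers, and the `revenue_per_order` derived variable are all assumptions made for the example, and the right choices depend on the data at hand.

```python
import statistics

# Hypothetical customer records; None marks a missing revenue figure.
records = [
    {"customer": "A", "orders": 4, "revenue": 200.0},
    {"customer": "B", "orders": 5, "revenue": 260.0},
    {"customer": "C", "orders": 3, "revenue": None},
    {"customer": "D", "orders": 6, "revenue": 330.0},
    {"customer": "E", "orders": 5, "revenue": 5000.0},
    {"customer": "F", "orders": 4, "revenue": 220.0},
]

# Missing data: impute the median of the known values (one simple option among many).
known = [r["revenue"] for r in records if r["revenue"] is not None]
median_revenue = statistics.median(known)

prepared = []
for r in records:
    row = dict(r)  # copy so the raw records stay untouched
    row["revenue_was_missing"] = r["revenue"] is None
    if row["revenue_was_missing"]:
        row["revenue"] = median_revenue
    prepared.append(row)

# Outliers: flag values outside the 1.5 * IQR fences rather than deleting them,
# so they can be characterized later.
q1, _, q3 = statistics.quantiles([row["revenue"] for row in prepared], n=4)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

for row in prepared:
    row["is_outlier"] = not (low_fence <= row["revenue"] <= high_fence)
    # Derived variable: average revenue per order.
    row["revenue_per_order"] = row["revenue"] / row["orders"]

outlier_customers = [row["customer"] for row in prepared if row["is_outlier"]]
```

Flagging imputed and outlying values, instead of silently overwriting or dropping them, keeps the later modeling steps honest about which inputs were observed and which were repaired.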
Once the data is prepared, CentraLytics undertakes to craft an effective and efficient predictive model. We apply the model against both training and testing sets, and install it on a production server to score evaluation data sets only after we have validated it for stability, extensibility, and actionable performance metrics such as lift.
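Lift compares the response rate among the records a model scores highest with the response rate of the population as a whole. The sketch below shows one common way to compute it. The function name `lift_at`, the 20% cutoff, and the scores and outcomes are hypothetical, chosen only to illustrate the calculation.

```python
def lift_at(scores, outcomes, fraction=0.1):
    """Lift in the top `fraction` of records when ranked by model score.

    scores   -- model scores, higher meaning more likely to respond
    outcomes -- actual results, 1 for a response and 0 otherwise
    """
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and the same length")
    base_rate = sum(outcomes) / len(outcomes)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no responders")
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))  # size of the top segment
    top_rate = sum(outcome for _, outcome in ranked[:k]) / k
    return top_rate / base_rate

# Hypothetical test-set scores and the outcomes actually observed.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

top_quintile_lift = lift_at(scores, outcomes, fraction=0.2)
```

A lift of 1.0 means the model's top segment responds no better than a random selection would. Comparing the lift measured on the training set with the lift on the testing set is one simple check of stability: a large gap between the two suggests the model has overfit.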
The promise of data mining is to make core business processes more efficient. Marketing, shop-floor utilization, product cost variance, labor utilization, and customer service can all benefit from transforming historical data into actionable predictive information.





