A comparison of Pentaho, AWS, Azure, and the open-source stack across five categories: Data Platform, Data Lake, Data Visualization, Data Science, and General.

Which data storage platform would you like to use? Which data ingestion [ETL] tools would you like to use? Which data processing tools would you like to use?

K-Means is an introductory clustering algorithm and the simplest of the clustering techniques. It is easy to implement and handy in practice.
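To give a feel for how little code the algorithm needs, here is a minimal sketch in plain Python. It assumes the standard formulation (assign each point to its nearest centroid, then move each centroid to the mean of its points); the function name `kmeans`, its parameters, and the sample points are illustrative and not taken from the post.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster a list of equal-length tuples into k groups (illustrative sketch)."""
    rng = random.Random(seed)
    # Start from k distinct points picked at random from the data.
    centroids = rng.sample(points, k)

    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)

        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(old)

        # Stop early once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return centroids


# Made-up sample data: two well-separated groups of 2-D points.
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sample_centroids = sorted(kmeans(sample_points, k=2))
```

On data this well separated, the two centroids should settle near the centre of each group; on real data the result depends on the starting centroids, which is why the algorithm is usually run several times.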

Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide..

Linear Regression is an analytical technique used to model the relationship between several input variables and a continuous outcome variable.
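As a rough illustration, the sketch below fits a line by ordinary least squares in plain Python. It is simplified to a single input variable, whereas the technique described above allows several; the function name `fit_line` and the sample data are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Slope = covariance(x, y) / variance(x); both share the same 1/n factor.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance

    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept  # predict the outcome for x = 5
```

With several input variables the same idea applies, but the slope becomes a vector of coefficients that is typically solved for with a linear algebra library rather than by hand.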

In this blog post, I explain how to further clean the technical data into ‘Consistent Data’ and describe the methodologies adopted.