# Blog

## Pentaho Blog

Data Analytics stack comparison

A comparison of Pentaho, AWS, Azure, and open-source stacks across data platform, data lake, data visualization, data science, and general criteria.

25 Key Questions before you choose any stack for data analytics

Which data storage platform would you like to use? Which data ingestion (ETL) tools would you like to use? Which data processing tools would you like to use?

Define Your KPI

A KPI is a measurable value that can be evaluated over a specific time period to determine the gap between actual and targeted performance and to determine organizational effectiveness and operational…

Increasing the Performance of ML Algorithms using Boosting Techniques

Boosting is a sequential process, where each subsequent model attempts to correct the errors of the previous model. The succeeding models are dependent on the previous model.
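The sequential idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the method from the post: it assumes a single numeric feature, squared-error loss, and one-split "stumps" as the weak models, and the function names (`fit_stump`, `boost`, `predict`), the learning rate, and the toy data are all made up for the example. Each round fits a stump to the errors (residuals) left by the models before it.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the largest value so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = sum((r - left_mean) ** 2 for r in left)
        err += sum((r - right_mean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    return best[1:]  # (threshold, left value, right value)


def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Build the ensemble one stump at a time."""
    base = sum(ys) / len(ys)  # the first "model" is just the mean
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # Each new stump is trained on the errors of the ensemble so far.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


# Made-up toy data with a jump between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]

model = boost(xs, ys)
baseline_mse = mse([sum(ys) / len(ys)] * len(ys), ys)
boosted_mse = mse([predict(model, x) for x in xs], ys)
print(baseline_mse, boosted_mse)
```

On this toy data the boosted ensemble has a lower training error than the mean-only baseline, because every round removes part of the error that the earlier rounds left behind.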

Obtaining Accurate Predictions using Ensemble Learning Techniques

Ensemble methods are often behind winning entries in machine learning competitions, where sophisticated combinations of algorithms produce highly accurate results.

Pattern Recognition using K-NN Algorithm

The K-Nearest Neighbors (K-NN) method of classification is one of the simplest methods in machine learning; it is essentially…

How Neural Networks can be used in Image and Speech Recognition

Artificial neural networks are computational models inspired by the human brain.

How Can Companies Utilise Random Forest algorithm to make Smart Predictions

Random Forest is a supervised learning algorithm. As the name suggests, it builds a forest of decision trees and introduces randomness into how they are built.

Text Classification using Support Vector Machines

Support vector machines are a type of supervised machine learning algorithm used for classification and regression tasks.

Segmentation using K-Means Algorithm

K-Means is an introductory clustering algorithm and the simplest of the clustering techniques. It is handy and easy to implement.
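To show how little code the basic algorithm needs, here is a minimal pure-Python sketch. It assumes the starting centroids are supplied by the caller (real implementations usually pick them randomly or with a smarter seeding scheme), and the function names and the six sample points are invented for the illustration.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, centroids, n_iter=10):
    """Alternate between assigning points and moving centroids."""
    centroids = list(centroids)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Made-up 2-D points forming two obvious groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

The first three points end up in one segment and the last three in the other, with each centroid sitting at the mean of its group.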

Dimensionality Reduction using Principal Component Analysis

Principal component analysis (PCA) is a technique used to identify a smaller number of uncorrelated variables, known as principal components, from a larger set of data.
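As a rough sketch of what that means in practice, the snippet below finds only the first principal component of a small data set. It is an illustration under stated assumptions: the data is invented, the function name `pca_first_component` is hypothetical, and power iteration on the covariance matrix stands in for the full eigendecomposition or SVD that a library would normally use.

```python
import math


def pca_first_component(rows, n_iter=100):
    """Return the first principal component and the data projected onto it."""
    n, d = len(rows), len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize. Assumes the start vector is not orthogonal to the
    # dominant direction, which holds for the sample data below.
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # One score per row: two correlated variables reduced to a single one.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Made-up data where the second variable is roughly twice the first.
rows = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
component, scores = pca_first_component(rows)
print(component)
print(scores)
```

Because the two columns move together, a single component pointing roughly along the direction (1, 2) captures almost all of the variation, and `scores` holds the one-dimensional representation of each row.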

Using Logistic Regression to predict Loan Defaults

Logistic regression classifies observations by estimating the probability that an observation belongs to a particular category.
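A minimal sketch of the idea, assuming a single made-up feature (a debt-to-income ratio) and invented default labels, with plain gradient descent standing in for whatever solver a real library would use. The names `fit_logistic` and `predict_proba` are chosen for this example only.

```python
import math


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, n_iter=2000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    """Estimated probability that the observation is in the positive class."""
    w, b = model
    return sigmoid(w * x + b)


# Hypothetical data: debt-to-income ratio and whether the loan defaulted (1) or not (0).
ratios = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
defaulted = [0, 0, 0, 1, 0, 1, 1, 1]

model = fit_logistic(ratios, defaulted)
print(predict_proba(model, 0.1), predict_proba(model, 0.9))
```

An applicant would be classified as a likely defaulter when the estimated probability crosses a chosen cutoff such as 0.5; in this toy data a ratio of 0.9 falls above that cutoff and a ratio of 0.1 falls below it.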