Random Forest vs Gradient Boosting

Tree-based algorithms are very promising for everyday data science problems, but it is their extended adaptations that make them so popular nowadays. Here, random forest and gradient boosting are approaches built on top of a core decision tree algorithm rather than decision tree algorithms themselves. In this post, we are going to compare these two techniques and discuss how they are similar and how they are different.

Green Trees by Andre Cook (Pexels)

Vlog

You can either continue to read this tutorial or watch the following video. They both cover the comparison of random forest and gradient boosting.


🙋‍♂️ You may consider enrolling in my top-rated machine learning course on Udemy

Decision Trees for Machine Learning

Random Forest

The randomness comes from splitting the data set into many sub data sets at random, and the many decision trees built on those sub sets explain the forest term.

Imagine a library with thousands of books. A single decision tree algorithm expects you to read all of the books one by one. Obviously, that would take a long time. On the other hand, suppose that hundreds of people come together and share the books in the library. In this case, each person reads only tens of books and all of the books are read soon. This is much more manageable, right? Random forest is very similar to this example. You split the data set into many sub data sets and run a separate decision tree algorithm on each of them.
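
To make the idea concrete, here is a minimal sketch of that splitting step in Python with scikit-learn. The data set, the number of trees and the sampling strategy (a random bootstrap sample per tree) are only illustrative choices, not a fixed recipe.

```python
# Minimal sketch of the bagging idea behind random forest (illustrative only):
# train many decision trees, each on its own random sub data set.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

rng = np.random.default_rng(42)
forest = []

for _ in range(100):  # the number of trees is a free parameter
    idx = rng.integers(0, len(X), size=len(X))  # random rows for this tree
    tree = DecisionTreeClassifier(max_features="sqrt")  # random feature subset per split
    tree.fit(X[idx], y[idx])
    forest.append(tree)
```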

Gradient boosting

Gradient boosting boosts the results with the gradient descent algorithm; its name comes from the combination of those two terms. It first builds a decision tree on the data set as is. Then, it builds another tree on the errors of the previous one. In that way, we end up with hundreds of sequential decision trees.

Notice that building each sequential tree still requires the whole data set; we just replace the target labels with the errors of the previous round. So, in gradient boosting we still need to read all the books in the library in every round.
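
Here is a simplified sketch of that idea for a regression task with a squared error loss. The data set, tree depth and learning rate are placeholders, and a real implementation would also handle initialization and other loss functions.

```python
# Simplified gradient boosting for regression (illustrative):
# each round fits a tree to the errors (residuals) of the current prediction.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, random_state=0)

learning_rate = 0.1
prediction = np.zeros(len(y))
trees = []

for _ in range(100):
    residuals = y - prediction          # errors of the previous round become the new target
    tree = DecisionTreeRegressor(max_depth=3)
    tree.fit(X, residuals)              # note: the whole data set is used in every round
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)
```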

Nowadays, LightGBM and XGBoost are popular gradient boosting implementations. Their names stand for light gradient boosting machine and extreme gradient boosting respectively.
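
Both libraries expose a scikit-learn style interface; the sketch below simply assumes they are installed and uses illustrative parameters.

```python
# Popular gradient boosting libraries with a scikit-learn style interface.
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier

xgb_model = XGBClassifier(n_estimators=100, learning_rate=0.1)
lgbm_model = LGBMClassifier(n_estimators=100, learning_rate=0.1)
# e.g. xgb_model.fit(X_train, y_train) and xgb_model.predict(X_test)
```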

Making a decision

In random forest, we have many independent decision trees. To make a final decision in classification tasks, we take the most frequent prediction among those trees. That is why the number of trees is usually set to an odd number, so that majority voting cannot end in a tie in binary problems. If you are going to use random forest for a regression task, then you take the average of the results of the trees.

In gradient boosting, we have many sequential and dependent decision trees. To make a decision, we sum the results of those sequential decision trees.
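
A tiny numerical sketch of the two aggregation rules, with made-up tree outputs:

```python
# Toy aggregation example with made-up tree outputs.
import numpy as np

# random forest, classification: majority vote over independent trees
votes = np.array([1, 0, 1, 1, 2])
print(np.bincount(votes).argmax())   # -> 1

# random forest, regression: average of the tree outputs
values = np.array([10.2, 9.8, 11.0, 10.5, 9.9])
print(values.mean())                 # -> 10.28

# gradient boosting: sum of the sequential corrections (already scaled)
steps = np.array([10.0, 0.6, -0.2, 0.1, 0.05])
print(steps.sum())                   # -> 10.55
```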

Tree algorithms

As mentioned before, both random forest and gradient boosting are approaches built on top of a core decision tree algorithm rather than decision tree algorithms themselves. They both require a core decision tree algorithm to build their trees.

If you are going to use random forest for a classification task, then you can use the ID3, C4.5, CART or CHAID algorithms. You can use regression trees for regression tasks with random forest as well.

On the other hand, gradient boosting requires regression trees even for classification tasks, because the errors it fits in each round are continuous values.
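
For instance, scikit-learn reflects this: its ensembles are built on CART-style trees, and its gradient boosting classifier fits regression trees on the gradients even for classification, whereas its random forest classifier fits classification trees. The parameters below are just illustrative.

```python
# scikit-learn builds CART-style trees under the hood; GradientBoostingClassifier
# fits regression trees on the gradients even for classification, whereas
# RandomForestClassifier fits classification trees.
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

rf = RandomForestClassifier(n_estimators=100)
gbm = GradientBoostingClassifier(n_estimators=100)
```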

Parallel running

Random forest can be run in parallel because the data set is already split and the tree algorithms can be run on those independent sub data sets at the same time.

On the other hand, gradient boosting has to build its sequential trees in serial because each tree requires the output of the previous one as input. Still, we are able to build branches in parallel within the core decision tree algorithm. So, gradient boosting can be partially parallelized.
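
In scikit-learn and XGBoost this difference shows up behind the n_jobs parameter; the sketch below only illustrates where the parallelism applies, with placeholder parameters.

```python
# Illustration of where the parallelism applies (assumes scikit-learn and xgboost).
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

# random forest: independent trees, so whole trees can be trained in parallel
rf = RandomForestClassifier(n_estimators=500, n_jobs=-1)

# gradient boosting: trees are built one after another, but the split search
# inside each tree can still use several threads
gbm = XGBClassifier(n_estimators=500, n_jobs=-1)
```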

Boosting

Finally, gradient boosting is not the only boosting technique. AdaBoost is a popular boosting technique as well!
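
For reference, here is a one-liner with scikit-learn's implementation; AdaBoost boosts by re-weighting misclassified samples rather than fitting trees to errors with gradient descent.

```python
# AdaBoost in scikit-learn; it boosts by re-weighting misclassified samples.
from sklearn.ensemble import AdaBoostClassifier

ada = AdaBoostClassifier(n_estimators=100)
```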

Conclusion

So, we have covered two important techniques in tree-based machine learning: bagging with random forest and boosting with gradient boosting. Nowadays, both techniques are highly adopted in Kaggle competitions. Even though neither is strictly better than the other, their approaches are a little bit different. We compared these methods and explained where they are similar and where they differ.


Like this blog? Support me on Patreon

Buy me a coffee