luminousmen
I P

Spark core concepts explained

Apache Spark is considered as a powerful complement to Hadoop, big data’s original technology. Spark is a more accessible, powerful and capable big data tool for tackling various big data challenges. It has become mainstream and most in-demand big data framework across all major industries. Spark has become part of the Hadoop since 2.0. And is one of the most useful technologies for Python Big...

Things you need to know about Hadoop and YARN being a Spark developer

Apache Spark is considered as a powerful complement to Hadoop, big data’s original technology. Spark is a more accessible, powerful and capable big data tool for tackling various big data challenges. It has become mainstream and most in-demand big data framework across all major industries. Spark has become part of the Hadoop since 2.0. And is one of the most useful technologies for Python Big...

Data Science. Correlation

There are a lot of engineers who have never been involved in statistics or data science. So, in order to build a data science pipelines or rewrite produced by data scientists code to an adequate, easily maintained code many nuances and misunderstandings arises from the engineering side. For these Data/ML engineers and novice data scientists, I make this series of articles. I'll try to explain...

Data Science. Measures

There are a lot of engineers who have never been involved in statistics or data science. So, in order to build a data science pipelines or rewrite produced by data scientists code to an adequate, easily maintained code many nuances and misunderstandings arises from the engineering side. For these Data/ML engineers and novice data scientists, I make this series of articles. I'll try to explain...

Data Science. Probability distributions

There are a lot of engineers who have never been involved in statistics or data science. So, in order to build a data science pipelines or rewrite produced by data scientists code to an adequate, easily maintained code many nuances and misunderstandings arises from engineering side. For these Data/ML engineers and novice data scientists, I make this series of articles. I'll try to explain some...

Data Science. Bayes theorem

There are a lot of engineers who have never been involved in statistics or data science. So, in order to build a data science pipelines or rewrite produced by data scientists code to an adequate, easily maintained code many nuances and misunderstandings arises from engineering side. For these Data/ML engineers and novice data scientists, I make this series of articles. I'll try to explain some...