Blog | luminousmen

Descriptive and Inferential Statistics

Many engineers have never worked in statistics or data science. But when they build data science pipelines or rewrite code produced by data scientists into adequate, easily maintained code, many nuances and misunderstandings arise from the engineering...

How to start your blog for 20 cents

The load on web applications is unpredictable: sometimes they serve a huge number of requests, and sometimes they sit idle. Hosting applications on cloud virtual machines forces us to pay for idle time too. To solve this problem we must look at...

7 tips to make an effective Python Style Guide

Style guide. Some teams call it their coding manual, coding standards, or coding conventions; these terms tend to refer to the same thing. A style guide is a set of standards, principles, and rules set by a team that each developer should follow. Determining the most appropriate style guide for...

My unpopular opinion about black code formatter

In this post, I will try to gather all my thoughts on automatic code formatting and why I personally don't like this approach. We will look at the most hyped tool: Black. The focus is on consistency, not readability. There is only one style. For some reason, Python developers are...

The ultimate Python style guidelines

Coding guidelines help engineering teams write consistent code that is easy for all team members to read and understand. Python has an excellent style guide called PEP8. It covers most of the situations you will run into while writing Python. I like PEP8; I believe there has been much...

Introduction to Pyspark join types

This article visualizes the various types of joins: a cheat sheet that lists all join types in one place, with examples and without stupid circles. Aaaah, circles! I'm tired of these explanations of joins with intersections of sets and circles. It seems to be both clear and...

The 5-minute guide to using bucketing in Pyspark

There are many different tools in the world, each of which solves a range of problems. Many of them are judged by how well and correctly they solve a given problem, but there are tools that you simply like and want to use. They are properly designed and fit well in your hand, you do not...

Spark tips. Don't collect data on driver

There are many different tools in the world, each of which solves a range of problems. Many of them are judged by how well and correctly they solve a given problem, but there are tools that you simply like and want to use. They are properly designed and fit well in your hand, you do not...

How to not leap in time using Python

If you want to display the time to a user of your application, you query the time of day. However, if your application needs to measure elapsed time, you need a timer that will give the right answer even if the user changes the time on the system clock. The system clock which tells the time of...

Spark tips. DataFrame API

There are many different tools in the world, each of which solves a range of problems. Many of them are judged by how well and correctly they solve a given problem, but there are tools that you simply like and want to use. They are properly designed and fit well in your hand, you do not...