Using TQDM

TQDM is an awesome package. Its progress bar rendering is really buggy in notebook cells, but it works great in Python scripts that y..{
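
As a rough illustration of the script use case, here is a minimal sketch that wraps a plain loop in a progress bar. It assumes the tqdm package is installed; the loop body and the "Summing" label are placeholders, not anything from the original post.

```python
from tqdm import tqdm

total = 0
# Wrapping any iterable in tqdm() draws a progress bar on stderr while the loop runs.
for i in tqdm(range(1000), desc="Summing"):
    total += i  # placeholder workload
```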

Decision Tree Classifier

A decision tree classifier is a straightforward tree-like model. The classifier is just a decision tree that splits the classes at each layer via a ..{
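
A minimal sketch of fitting such a classifier, assuming scikit-learn and its bundled iris dataset. The depth limit and the train/test split are illustrative choices, not the post's settings.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# max_depth caps how many layers of splits the tree may grow.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
accuracy = tree.score(X_test, y_test)  # mean accuracy on the held-out split
```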

Confusion Matrices

Confusion matrices are commonly used in classification problems. I used them constantly in a recent fraud detection challenge to see the potent..{
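
For a binary problem, the matrix can be sketched like this, assuming scikit-learn. The labels below are made up for illustration and are not from the fraud detection challenge.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels: 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0]

# Rows are the true classes, columns are the predicted classes.
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
```

Here `fp` counts legitimate cases flagged as fraud and `fn` counts fraud cases that slipped through.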

Crosstab Table

The pandas crosstab function is really useful when you want to create pivot tables for dimensional features. With one line of code..{
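
A small sketch of that one-liner, using a made-up data frame. The `region` and `product` columns are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "product": ["a", "b", "a", "a", "b"],
})

# Count how often each (region, product) pair occurs.
table = pd.crosstab(df["region"], df["product"])
```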

Downsampling

In supervised learning, many datasets are class imbalanced. Therefore, you will often need to downsample the majority class to match ..{
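
One way to sketch this with pandas alone, assuming a binary `label` column (a hypothetical name) where class 0 is the majority:

```python
import pandas as pd

# Made-up imbalanced data: 10 positives, 90 negatives.
df = pd.DataFrame({"feature": range(100), "label": [1] * 10 + [0] * 90})

minority = df[df["label"] == 1]
majority = df[df["label"] == 0]

# Randomly keep only as many majority rows as there are minority rows.
majority_down = majority.sample(n=len(minority), random_state=0)

# Recombine and shuffle so the classes are not grouped together.
balanced = pd.concat([minority, majority_down]).sample(frac=1, random_state=0)
```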

Import Matlab Data

Some data provided by universities comes as a .mat file. This file type is native to MATLAB and can be imported via the io functi..{
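
The excerpt is cut off, so this sketch assumes the functions meant are the ones in `scipy.io`. It writes a tiny .mat file first so that it is runnable on its own; the file name and the `measurements` variable are hypothetical.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Create a small .mat file to read back (stand-in for a downloaded dataset).
path = os.path.join(tempfile.mkdtemp(), "example.mat")
savemat(path, {"measurements": np.arange(6).reshape(2, 3)})

# loadmat returns a dict mapping MATLAB variable names to NumPy arrays,
# plus a few metadata keys such as "__header__".
data = loadmat(path)
measurements = data["measurements"]
```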

Cross Validation and K-Fold

Cross-validation is a general technique for scoring a model using only the training data set. K-fold cross-validation is a better heuri..{
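
A minimal K-fold sketch, assuming scikit-learn. The logistic regression model and the choice of five folds are illustrative only.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)

# Split the data into 5 folds; each fold serves once as the validation set.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=kfold)  # one accuracy score per fold
mean_score = scores.mean()
```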

DBSCAN

DBSCAN stands for density-based spatial clustering of applications with noise. The algorithm selects random points in the feature space and if the ..{
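
A small sketch with scikit-learn's `DBSCAN` on made-up 2D points. The `eps` and `min_samples` values are picked to suit this toy data, not taken from the post.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two tight groups plus one far-away point.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
    [20.0, 20.0],
])

# eps is the neighbourhood radius; min_samples is the number of points
# (including the point itself) needed to form a dense region.
labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(points)
```

Points that belong to no dense region get the label -1, which is how the "noise" in the name shows up in practice.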

Heatmaps

Heatmaps are crazy useful. I use them as diagnostic plots to take a look at the feature correlation in my data frame and to understand the cro..{
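
As a sketch of the correlation use case, the example below draws a heatmap with plain matplotlib on a made-up data frame. The post may well use a different plotting library, and the column names are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["a", "b", "c"])
df["d"] = df["a"] * 2 + rng.normal(scale=0.1, size=100)  # strongly tied to "a"

corr = df.corr()  # pairwise Pearson correlations

fig, ax = plt.subplots()
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns)
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(image, ax=ax, label="Pearson correlation")
# Call fig.savefig(...) or plt.show() to view the result.
```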

Histograms

Plotting histograms of distributions is a common task with almost any dataset. The default blue histogram in matplotlib can be boring and dull. Therefo..{

K-Nearest Neighbours Classifier

KNN is a very simple machine learning algorithm. Given a distance parameter and a nearest-neighbours parameter, the algorithm uses the premise that ..{
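
A minimal sketch of those two parameters in scikit-learn's `KNeighborsClassifier`, assuming the iris dataset. Five neighbours and Euclidean distance are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# n_neighbors is the number of neighbours that vote;
# metric="minkowski" with p=2 is plain Euclidean distance.
knn = KNeighborsClassifier(n_neighbors=5, metric="minkowski", p=2)
knn.fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)
```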

Null Values

There are multiple ways to handle missing data. Some people come up with some very creative solutions. This notebook contains some basic method..{
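
A few of the basic options can be sketched with pandas as follows. The data frame and the fill values are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Oslo", "Lima", None, "Pune"],
})

# Count the missing values in each column.
missing_per_column = df.isnull().sum()

# Option 1: drop every row that has any missing value.
dropped = df.dropna()

# Option 2: fill numeric gaps with the median and text gaps with a marker.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["city"] = filled["city"].fillna("unknown")
```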

Plotting Residuals

Plotting your residuals for regression problems is crazy useful. This is another diagnostic plot that you can use to figure out if you did someth..{
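
A rough sketch of a residual plot, assuming a simple straight-line fit on synthetic data. The data and the use of `np.polyfit` are illustrative, not the post's model.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=x.size)  # noisy line

# Fit a straight line by least squares and compute the residuals.
slope, intercept = np.polyfit(x, y, deg=1)
predicted = slope * x + intercept
residuals = y - predicted

fig, ax = plt.subplots()
ax.scatter(predicted, residuals)
ax.axhline(0, color="black", linewidth=1)  # residuals should scatter around zero
ax.set_xlabel("Predicted value")
ax.set_ylabel("Residual")
```

A visible curve or funnel shape in this plot usually means the model is missing structure in the data.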

Randomized Grid Search

Manual hyperparameter searching? No way. Scikit-learn has an amazing randomized grid search function that can give us a hint at the best par..{
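
The scikit-learn class for this is `RandomizedSearchCV`. Here is a minimal sketch; the random forest model and the parameter ranges are assumptions for illustration.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)

# Distributions to sample from, instead of an exhaustive grid.
param_distributions = {
    "n_estimators": randint(10, 50),  # integers in [10, 50)
    "max_depth": randint(2, 6),       # integers in [2, 6)
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=5,         # number of random parameter combinations to try
    cv=3,             # 3-fold cross-validation for each combination
    random_state=0,
)
search.fit(X, y)
best_params = search.best_params_
```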

Standardization

As I learned from a recent model, standardization is recommended before training many machine learning models. If you wanted to scal..{
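
A minimal sketch with scikit-learn's `StandardScaler`, on a made-up matrix whose two columns have very different scales:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([
    [1.0, 100.0],
    [2.0, 200.0],
    [3.0, 300.0],
    [4.0, 400.0],
])

# Rescale each column to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```

In a real workflow the scaler would be fitted on the training split only and then applied to the test split with `scaler.transform`.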
