Tools of a Data Scientist
This is a comprehensive list of the main tools I use in my workflow, plus an additional section on client-related tools I have been exposed to. No...
Train-Test Split
Splitting a dataset into a train and test set is a requirement for evaluating any machine learning model. Sklearn's train_test_split can be use...
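As a rough illustration of the split described above, here is a minimal sketch using scikit-learn's `train_test_split`. The toy data, the 20% test size, and the `random_state` value are assumptions chosen for the example, not taken from the post.

```python
from sklearn.model_selection import train_test_split

# Toy data (made up for illustration): 10 samples, one feature, binary labels
X = [[i] for i in range(10)]
y = [0, 1] * 5

# Hold out 20% of the rows for testing; random_state makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(len(X_train), len(X_test))
```

The model would then be fit on `X_train`/`y_train` only and scored on the held-out `X_test`/`y_test`.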
Fizzbuzz
The fizzbuzz question is a basic interview question for coders. Basically, write some code that says 'Fizzbuzz' when a value is divisible by 15,...
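A minimal sketch of one possible answer follows. The excerpt only spells out the divisible-by-15 case; the 'Fizz' for multiples of 3 and 'Buzz' for multiples of 5 rules are the usual form of the question and are assumed here, as is the function name `fizzbuzz`.

```python
def fizzbuzz(n: int) -> str:
    """Return the fizzbuzz word for n, or n itself as a string."""
    # Check 15 first, otherwise multiples of 15 would be caught by the 3 or 5 branch
    if n % 15 == 0:
        return "Fizzbuzz"
    if n % 3 == 0:   # assumed rule: multiples of 3 say 'Fizz'
        return "Fizz"
    if n % 5 == 0:   # assumed rule: multiples of 5 say 'Buzz'
        return "Buzz"
    return str(n)


for i in range(1, 16):
    print(fizzbuzz(i))
```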
Regex in Python
As you deal with more and messier string data, regex starts becoming a very valuable skill that can save you a lot of time. It also has a second...
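To give a flavor of the kind of cleanup regex helps with, here is a small sketch using Python's built-in `re` module. The sample strings and the order-number pattern are invented for illustration.

```python
import re

# Hypothetical messy inputs: inconsistent case and spacing around the '#'
messy = ["Order #1234 shipped", "order#56", "no order here"]

# 'order', optional whitespace, '#', then capture one or more digits
pattern = re.compile(r"order\s*#(\d+)", flags=re.IGNORECASE)

order_ids = []
for text in messy:
    match = pattern.search(text)
    # Keep the captured number as an int, or None when nothing matched
    order_ids.append(int(match.group(1)) if match else None)

print(order_ids)
```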
Binning Feature
Binning data into a few categorical groups can help us summarize sparse continuous values into a few data points. This can be u...
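One way to bin a continuous column is `pandas.cut`; whether the post uses it is an assumption. The ages, bin edges, and labels below are made up for the example.

```python
import pandas as pd

# Hypothetical continuous feature
ages = pd.Series([3, 17, 25, 42, 68, 80])

# Bin edges are right-inclusive by default: (0, 18], (18, 40], (40, 65], (65, 120]
bins = [0, 18, 40, 65, 120]
labels = ["child", "young adult", "adult", "senior"]

age_group = pd.cut(ages, bins=bins, labels=labels)

# Count how many rows landed in each bin
counts = age_group.value_counts(sort=False)
print(counts)
```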
Converting Notebook to Slides
You can convert any notebook to slides using the following commands. Note that scrolling for fragments is now disabled by default. You can enabl...
Label Encoding
Encoding a categorical feature into numeric values before processing the data through your machine learning model is now easier than ever, give...
Resampling Datetime
When plotting time series data, it becomes advantageous to reframe and aggregate the data by a period of time. Data scientists who deal with ti...
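Aggregating by a period of time can be sketched with pandas' `resample`. The daily series below is synthetic, and weekly sums are just one choice of period and aggregation.

```python
import pandas as pd

# Synthetic daily series: 14 days starting Monday 2024-01-01, values 0..13
idx = pd.date_range("2024-01-01", periods=14, freq="D")
daily = pd.Series(range(14), index=idx)

# Aggregate to weekly totals; "W" groups into weeks ending on Sunday
weekly = daily.resample("W").sum()
print(weekly)
```

Swapping `.sum()` for `.mean()` or `.max()` changes the aggregation without changing the grouping.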
Styling a DataFrame
Styling a data frame is pleasant surprisingly often. You can define a coloring function for your data frame, then apply the styling whe...
Using Select Dtypes
Once you get used to using pandas, filtering a dataframe for content quickly and efficiently can be a huge asset. Pandas introduced a new feature ...
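Going by the post title, the feature in question is `DataFrame.select_dtypes`. A minimal sketch, with a made-up dataframe:

```python
import pandas as pd

# Hypothetical mixed-type dataframe
df = pd.DataFrame({
    "name": ["a", "b"],
    "age": [30, 41],
    "score": [0.5, 0.9],
})

# Keep only numeric columns (ints and floats)
numeric = df.select_dtypes(include="number")

# Keep everything that is not numeric
non_numeric = df.select_dtypes(exclude="number")

print(list(numeric.columns), list(non_numeric.columns))
```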
Using the OS Module
This notebook is a combination of little snippets of Python code from Python's OS module that I found useful for a variety of tasks. These t...
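The excerpt does not say which snippets the notebook covers, so the following is only a guess at the kind of task it means: building paths, creating directories, and walking a folder tree. Everything happens in a temporary directory, and the file names are invented.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create nested folders; exist_ok avoids an error if they already exist
    raw_dir = os.path.join(root, "data", "raw")
    os.makedirs(raw_dir, exist_ok=True)

    # Write a small hypothetical file to find later
    csv_path = os.path.join(raw_dir, "sample.csv")
    with open(csv_path, "w") as f:
        f.write("a,b\n1,2\n")

    # Walk the tree and collect every .csv path, relative to the root
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".csv"):
                found.append(os.path.relpath(os.path.join(dirpath, name), root))

    # Split a file name into its stem and extension
    stem, ext = os.path.splitext(os.path.basename(csv_path))

print(found, stem, ext)
```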
Create Dummy Variables
Pandas to the rescue again. The pandas "pd.get_dummies" function can be applied to a dataframe to take one categorical feature and create a one-...
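In pandas the dummy-variable function is `pd.get_dummies`. Here is a minimal sketch on a made-up dataframe; the `dtype=int` argument is a choice made here so the output is 0/1 rather than booleans.

```python
import pandas as pd

# Hypothetical dataframe with one categorical column
df = pd.DataFrame({"color": ["red", "green", "red"], "size": [1, 2, 3]})

# Replace 'color' with one indicator column per category
dummies = pd.get_dummies(df, columns=["color"], dtype=int)
print(dummies)
```

Passing `drop_first=True` would drop one indicator per feature, which some linear models need to avoid redundant columns.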
Removing Outliers
Handling outliers in a dataset requires a bit of intuition and domain expertise to get right for descriptive and predictive analytics. Yet ...
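The excerpt does not say which method the post uses. As one common rule of thumb (an assumption here, not the author's stated approach), values more than 1.5 times the interquartile range outside the quartiles can be filtered out. The sample values are made up.

```python
import pandas as pd

# Hypothetical measurements with one obvious outlier
values = pd.Series([10, 12, 11, 13, 12, 11, 95])

# Interquartile range: the spread of the middle 50% of the data
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1

# Assumed rule: keep values within 1.5 * IQR of the quartiles
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
filtered = values[values.between(lower, upper)]

print(filtered.tolist())
```

Whether a flagged value should actually be dropped still depends on the domain; the rule only marks candidates.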