13 Sep

Time Zone Conversion

Data is not always in the format you need. Most time data is stored in a database in UTC, so after exporting it you may need to convert the timestamps to a local or other time zone for descriptive analysis.
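
As a minimal sketch of that situation, assume an export whose timestamps are plain strings known to be in UTC. The frame and column names (exported, created_at, created_local) are hypothetical and only serve the illustration.

```python
import pandas as pd

# Hypothetical export: timestamp strings that are known to be in UTC
exported = pd.DataFrame({'created_at': ['2018-01-01 00:00:00',
                                        '2018-07-01 00:00:00']})

# Parse the strings and mark them as UTC
exported['created_at'] = pd.to_datetime(exported['created_at'], utc=True)

# Convert to a local time zone; the offset follows daylight saving time
exported['created_local'] = exported['created_at'].dt.tz_convert('America/Los_Angeles')
print(exported)
```

The January row lands on the previous day at UTC-8, while the July row uses UTC-7 because daylight saving time is in effect.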

Import Preliminaries

In [53]:
%matplotlib inline
%config InlineBackend.figure_format='retina'

# Import modules
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import matplotlib as mpl
import numpy as np
import pandas as pd 
import pytz
import sklearn
import seaborn as sns
import warnings

# Import Model Selection 
from sklearn.model_selection import train_test_split, cross_val_score

# Set pandas options
pd.set_option('display.max_columns',1000)
pd.set_option('display.max_rows',30)
pd.set_option('display.float_format', lambda x: '%.3f' % x)

# Set plotting options
mpl.rcParams['figure.figsize'] = (8.0, 7.0)

# Set warning options
warnings.filterwarnings('ignore')

Import Data

In [54]:
# Create a dataset with time from a couple of different cities
date_range = pd.date_range(start='1/1/2018', periods=2, tz='America/Los_Angeles')
date_range = date_range.append(pd.date_range(start='1/1/2018', periods=2, tz='America/Chicago'))
date_range = date_range.append(pd.date_range(start='1/1/2018', periods=2, tz='America/Costa_Rica'))
date_range = date_range.append(pd.date_range(start='1/1/2018', periods=2, tz='Asia/Tokyo'))
date_range = date_range.append(pd.date_range(start='1/1/2018', periods=2, tz='Asia/Dubai'))
date_range

# Create a dataframe with the timezone information
timezones = pd.DataFrame(data={'date_time':date_range,
                   'value':  np.random.choice(np.arange(0,20), size=10, replace=True)})

# View sample of the dataframe
timezones
Out[54]:
date_time value
0 2018-01-01 00:00:00-08:00 6
1 2018-01-02 00:00:00-08:00 3
2 2018-01-01 00:00:00-06:00 16
3 2018-01-02 00:00:00-06:00 8
4 2018-01-01 00:00:00-06:00 12
5 2018-01-02 00:00:00-06:00 19
6 2018-01-01 00:00:00+09:00 14
7 2018-01-02 00:00:00+09:00 6
8 2018-01-01 00:00:00+04:00 11
9 2018-01-02 00:00:00+04:00 6

Timezone Modification

In [55]:
# Convert all time information into Pacific Time (America/Los_Angeles)
timezones.date_time = timezones.date_time.apply(lambda x: x.tz_convert('America/Los_Angeles'))
timezones
Out[55]:
date_time value
0 2018-01-01 00:00:00-08:00 6
1 2018-01-02 00:00:00-08:00 3
2 2017-12-31 22:00:00-08:00 16
3 2018-01-01 22:00:00-08:00 8
4 2017-12-31 22:00:00-08:00 12
5 2018-01-01 22:00:00-08:00 19
6 2017-12-31 07:00:00-08:00 14
7 2018-01-01 07:00:00-08:00 6
8 2017-12-31 12:00:00-08:00 11
9 2018-01-01 12:00:00-08:00 6
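
The row-by-row apply above works on a column of mixed-zone Timestamp objects. One possible alternative, sketched here on a small stand-in Series (mixed and pacific are illustrative names, not part of the notebook), is to normalize the column to UTC with pd.to_datetime and then convert it in one vectorized step.

```python
import pandas as pd

# Stand-in for a mixed-zone column (object dtype, one Timestamp per row)
mixed = pd.Series([
    pd.Timestamp('2018-01-01 00:00', tz='America/Chicago'),
    pd.Timestamp('2018-01-01 00:00', tz='Asia/Tokyo'),
    pd.Timestamp('2018-01-01 00:00', tz='Asia/Dubai'),
])

# Normalize every value to UTC, then convert the whole column at once
pacific = pd.to_datetime(mixed, utc=True).dt.tz_convert('America/Los_Angeles')
print(pacific)
```

The result is a proper timezone-aware datetime column rather than an object column, so the .dt accessor and datetime arithmetic are available afterwards.
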
In [57]:
# Strip timezone information from date_time feature
timezones.date_time = timezones.date_time.apply(lambda x: x.replace(tzinfo=None))
timezones
Out[57]:
date_time value
0 2018-01-01 00:00:00 6
1 2018-01-02 00:00:00 3
2 2017-12-31 22:00:00 16
3 2018-01-01 22:00:00 8
4 2017-12-31 22:00:00 12
5 2018-01-01 22:00:00 19
6 2017-12-31 07:00:00 14
7 2018-01-01 07:00:00 6
8 2017-12-31 12:00:00 11
9 2018-01-01 12:00:00 6
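
When the column already has a timezone-aware datetime dtype, pandas offers two vectorized ways to drop the zone, and they give different results. The sketch below uses a small illustrative Series (aware, local_naive, and utc_naive are assumed names) to show the difference.

```python
import pandas as pd

# Timezone-aware column in Pacific Time
aware = pd.Series(pd.date_range('2018-01-01', periods=2, tz='America/Los_Angeles'))

# Drop the zone but keep the local wall-clock time
local_naive = aware.dt.tz_localize(None)

# Convert to UTC first, then drop the zone
utc_naive = aware.dt.tz_convert(None)

print(local_naive.iloc[0])  # 2018-01-01 00:00:00
print(utc_naive.iloc[0])    # 2018-01-01 08:00:00
```

tz_localize(None) matches what replace(tzinfo=None) does above: the clock reading stays the same and only the zone label is removed.
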
In [59]:
# Add timezone information back to date_time feature
pst_tz = pytz.timezone('America/Los_Angeles')
timezones.date_time = timezones.date_time.apply(lambda x: pst_tz.localize(x))
timezones
Out[59]:
date_time value
0 2018-01-01 00:00:00-08:00 6
1 2018-01-02 00:00:00-08:00 3
2 2017-12-31 22:00:00-08:00 16
3 2018-01-01 22:00:00-08:00 8
4 2017-12-31 22:00:00-08:00 12
5 2018-01-01 22:00:00-08:00 19
6 2017-12-31 07:00:00-08:00 14
7 2018-01-01 07:00:00-08:00 6
8 2017-12-31 12:00:00-08:00 11
9 2018-01-01 12:00:00-08:00 6
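
Attaching a zone to naive timestamps can fail around daylight saving transitions, when a local time is either skipped or occurs twice. As a hedged sketch, the vectorized tz_localize accepts ambiguous and nonexistent arguments to control this; the sample values below are chosen for illustration, and marking the problem rows as NaT is only one possible policy.

```python
import pandas as pd

naive = pd.Series(pd.to_datetime([
    '2018-01-01 00:00:00',  # ordinary local time
    '2018-03-11 02:30:00',  # skipped when clocks spring forward
    '2018-11-04 01:30:00',  # occurs twice when clocks fall back
]))

# Mark skipped and ambiguous local times as NaT instead of raising an error
localized = naive.dt.tz_localize('America/Los_Angeles',
                                 ambiguous='NaT', nonexistent='NaT')
print(localized)
```

Only the first row receives the -08:00 offset; the other two become NaT so they can be reviewed separately.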

Author: Kavi Sekhon