Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

python - pytz: return Olson Timezone name from only a GMT Offset

I have a legacy application whose data I need to supplement. Currently, we have a DB table storing zip codes for the US and its territories, along with a GMT offset and a flag showing whether that zip code observes daylight saving time. The table was downloaded from some free provider whose source I can't find right now.

I now need to supplement this table with the full Olson name (e.g. America/New_York) of each zip code, because that seems to be the only good way to convert a date/time stored in the database as local to that purchaser into a UTC-aware datetime object.

Here's a look at the table:

zip    state  city          lat      lon       gmt  dst 
00605  PR     AGUADILLA     18.4372  -67.1593  -4   f
02830  RI     HARRISVILLE   41.9782  -71.7679  -5   t
99503  AK     ANCHORAGE     61.1895  -149.874  -9   t

In a related table, Purchases, I have a Postgres timestamp without time zone column, which currently contains values like 2014-05-27T15:54:26, representing the local time at which a purchase was made at that zip code. (Ignore the stupidity of stripping out timezone information when saving these localized timestamps to the database.)

The big ask is:

How can I create a normalized UTC time from that timestamp string for each zipcode in the zipcode table? This would assume that the timestamp was written to the DB as local to each of the example rows in the zipcode table.

For example, manually looking up the Olson timezone names for each item in the example table, I come up with the following:

>>> from datetime import datetime
>>> import pytz
>>> timestring = '2014-05-27T15:54:26'
>>> dt_naive = datetime.strptime(timestring, '%Y-%m-%dT%H:%M:%S')

>>> # First example - Puerto Rico (no DST since 1945)
>>> print(pytz.utc.normalize(pytz.timezone('America/Puerto_Rico').localize(dt_naive)))
2014-05-27 19:54:26+00:00

>>> # Second example - Rhode Island (at that timestamp, the UTC offset was the same as PR because of DST)
>>> print(pytz.utc.normalize(pytz.timezone('US/Eastern').localize(dt_naive)))
2014-05-27 19:54:26+00:00

>>> # Third example - Anchorage, AK (AKDT at timestamp)
>>> print(pytz.utc.normalize(pytz.timezone('America/Anchorage').localize(dt_naive)))
2014-05-27 23:54:26+00:00

I've seen several commercial products selling a zip code database that provides a zipcode -> timezone lookup. However, they seem to give only an abbreviation such as "EST" for each timezone. So I thought I could map the possible abbreviations for US timezones (including territories) to an Olson name for each. That might look something like this:

zipcode_olson_lookup = {
    ('PR', 'f', 'AST'): 'America/Puerto_Rico',
    ('AK', 'f', 'AKDT',): 'America/Anchorage',
    ('AK', 't', 'AKT',): 'America/Anchorage',
    ...
}

Any suggestions are greatly welcome!



1 Reply


A UTC offset by itself is ambiguous: it may correspond to several timezones whose rules differ in some time periods:

#!/usr/bin/env python
from datetime import datetime, timedelta
import pytz # $ pip install pytz

input_utc_offset = timedelta(hours=-4)
timezone_ids = set()
now = datetime.now(pytz.utc) #XXX: use date that corresponds to input_utc_offset instead!
for tz in map(pytz.timezone, pytz.all_timezones_set):
    dt = now.astimezone(tz)
    # pytz keys its private _tzinfos mapping by (utcoffset, dst, tzname);
    # fixed-offset zones lack it, so fall back to the current values in the same order
    tzinfos = getattr(tz, '_tzinfos',
                      [(dt.utcoffset(), dt.dst(), dt.tzname())])
    if any(utc_offset == input_utc_offset for utc_offset, _, _ in tzinfos):
        # match timezones that have/had/will have the same utc offset
        timezone_ids.add(tz.zone)
print(timezone_ids)

Output

{'America/Anguilla',
 'America/Antigua',
 'America/Argentina/Buenos_Aires',
 ...,
 'Cuba',
 'EST5EDT',
 'Jamaica',
 'US/East-Indiana',
 'US/Eastern',
 'US/Michigan'}

You can't even limit the list using pytz.country_timezones['us'] because it would exclude one of your examples: 'America/Puerto_Rico'.
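The table's dst flag and a list of country codes can still narrow the candidates. The following is only a sketch under assumptions that are not in the original: the helper name candidate_zones, the two probe dates in 2014, and the choice of the 'US' and 'PR' country codes are all illustrative. It compares each zone's offset on a winter date and on a summer date with the gmt and dst columns.

```python
from datetime import datetime, timedelta
import pytz

# Assumption: probe dates chosen so that US DST is off in January and on in July.
WINTER = datetime(2014, 1, 15, 12)
SUMMER = datetime(2014, 7, 15, 12)

def candidate_zones(gmt_offset_hours, uses_dst, country_codes=('US', 'PR')):
    """Return zone names whose winter offset equals gmt_offset_hours and
    whose summer offset differs from it exactly when uses_dst is true."""
    wanted = timedelta(hours=gmt_offset_hours)
    names = set()
    for code in country_codes:
        for name in pytz.country_timezones[code]:
            tz = pytz.timezone(name)
            winter = tz.localize(WINTER).utcoffset()
            summer = tz.localize(SUMMER).utcoffset()
            if winter == wanted and (summer != winter) == uses_dst:
                names.add(name)
    return names

print(sorted(candidate_zones(-4, False)))  # row for zip 00605 (PR)
print(sorted(candidate_zones(-9, True)))   # row for zip 99503 (AK)
```

The result can still contain more than one name (several Alaska zones share the same offset and DST behavior, for example), so this only shrinks the ambiguity rather than removing it.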


If you know the coordinates (latitude, longitude), you can get the timezone id from a shape file, using either a local database or a web service:

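#!/usr/bin/env python
# Note: this queries a web service, so it needs network access. Newer geopy
# releases may require an API key and may expose this lookup under another name.
from geopy import geocoders # pip install "geopy[timezone]"

g = geocoders.GoogleV3()
for coords in [(18.4372,  -67.1593), (41.9782,  -71.7679), (61.1895,  -149.874)]:
    print(g.timezone(coords).zone)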

Output

America/Puerto_Rico
America/New_York
America/Anchorage

Note: some local times are ambiguous, e.g. when clocks fall back at the end of DST and the same wall-clock hour occurs twice. You can pass is_dst=None to the .localize() method to raise an exception in such cases.
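As a minimal sketch of that strict mode, assuming the zone name for a zip code is already known: the helper name local_to_utc is hypothetical, and the timestamp format is the one from the question.

```python
from datetime import datetime
import pytz

def local_to_utc(timestring, zone_name):
    """Parse a naive local timestamp and convert it to an aware UTC datetime.

    With is_dst=None, pytz raises AmbiguousTimeError or NonExistentTimeError
    instead of guessing when the local time falls in a DST transition.
    """
    naive = datetime.strptime(timestring, '%Y-%m-%dT%H:%M:%S')
    local = pytz.timezone(zone_name).localize(naive, is_dst=None)
    return local.astimezone(pytz.utc)

print(local_to_utc('2014-05-27T15:54:26', 'America/Anchorage'))
# 2014-05-27 23:54:26+00:00

try:
    # 01:30 on 2014-11-02 occurred twice in US Eastern time (clocks fell back)
    local_to_utc('2014-11-02T01:30:00', 'America/New_York')
except pytz.exceptions.AmbiguousTimeError as exc:
    print('ambiguous local time:', exc)
```

How to resolve the ambiguous rows (skip them, flag them for review, or pick one side with is_dst=True/False) depends on the application.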

Different versions of the tz database may have different UTC offsets for some timezones at some dates, i.e. storing the UTC time and the timezone id may not be enough (which version to use depends on your application).
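If the version matters for your data, one option (a suggestion, not something the original answer prescribes) is to record which tz database release produced each conversion. pytz exposes both its own version and the release of the tz database it bundles:

```python
import pytz

# pytz ships with one specific release of the tz database; storing these
# values next to converted timestamps lets them be re-derived later if
# the rules for a zone change.
print(pytz.__version__)    # version of the pytz package
print(pytz.OLSON_VERSION)  # tz database release bundled with this pytz
```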

