DateTimeParser
DateTimeParser is a tokenizer that attempts to parse a string into a `DateTime` (or a `NaiveDateTime` if the timezone cannot be determined), a `Date`, or a `Time`.
The biggest ambiguity between datetime formats is whether the order is `ymd` (year month day), `mdy` (month day year), or `dmy` (day month year). This is resolved by checking whether the string uses slashes or dashes. If slashes, then it will try `dmy` first. All other cases will use the international format `ymd`. Sometimes, if the conditions are right, it can even parse `dmy` with dashes if the month is written as a word (a "vocal" month, e.g., "Jan").
If the string consists of only numbers, then we will try two other parsers depending on the number of digits: Epoch or Serial. Otherwise, we'll try the tokenizer.
If the string is 10-11 digits with optional precision, then we'll try to parse it as a Unix Epoch timestamp.
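As a rough illustration of the Epoch case, the sketch below converts a digit string to a `DateTime` using only the standard library. `EpochSketch` and `from_epoch/1` are hypothetical names, not part of DateTimeParser, and the sketch ignores the optional precision (fractional seconds) that the library accepts.

```elixir
defmodule EpochSketch do
  # Hypothetical helper for illustration; not DateTimeParser's implementation.
  # Accepts a string of whole seconds since 1970-01-01T00:00:00Z.
  def from_epoch(string) when is_binary(string) do
    case Integer.parse(string) do
      # The whole string was consumed, so it is a plain integer.
      {seconds, ""} -> DateTime.from_unix(seconds)
      # Anything else (fractions, letters) is out of scope for this sketch.
      _ -> {:error, :not_an_integer}
    end
  end
end

{:ok, datetime} = EpochSketch.from_epoch("1546300800")
IO.puts(DateTime.to_iso8601(datetime))
# prints 2019-01-01T00:00:00Z
```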
If the string is 1-5 digits with optional precision, then we'll try to parse it as a Serial timestamp (spreadsheet time), treating 1899-12-31 as 1. As a result, Excel-produced dates from 1900-01-01 until 1900-03-01 will be off, which reflects the fact that Excel's own dates in that range are incorrect (Excel treats 1900 as a leap year).
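The Serial conversion can be sketched with plain date arithmetic. This is a minimal sketch under the assumption stated above (serial 1 is 1899-12-31, so day 0 is 1899-12-30); `SerialSketch` and `from_serial/1` are hypothetical names, and the library's actual implementation may differ.

```elixir
defmodule SerialSketch do
  # Hypothetical helper for illustration; not part of DateTimeParser.
  @seconds_per_day 86_400

  # Integers carry no time component, so the result is a Date.
  def from_serial(serial) when is_integer(serial) do
    Date.add(~D[1899-12-30], serial)
  end

  # Floats carry the time of day in the fractional part,
  # so the result is a NaiveDateTime.
  def from_serial(serial) when is_float(serial) do
    NaiveDateTime.add(~N[1899-12-30 00:00:00], round(serial * @seconds_per_day), :second)
  end
end

SerialSketch.from_serial(1)
# returns ~D[1899-12-31]
SerialSketch.from_serial(1.5)
# returns ~N[1899-12-31 12:00:00]
```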
| digits | parser | range | notes |
|---|---|---|---|
| 1-5 | Serial | low = `1900-01-01`, high = `2173-10-15`. Negative numbers go to `1626-03-17` | Floats indicate time. Integers do not. |
| 6-9 | Tokenizer | any | This allows for "20190429" to be parsed as `2019-04-29` |
| 10-11 | Epoch | low = `1976-03-03T09:46:40`, high = `5138-11-16 09:46:39` | If padded with 0s, then it can capture the entire range. Negative numbers are not yet supported. |
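The digit-count dispatch in the table can be sketched as a small classifier. `DispatchSketch` and `classify/1` are hypothetical names used only for illustration. The sketch handles unsigned numbers only and counts the digits before any decimal point, which is an assumption about how "optional precision" is treated.

```elixir
defmodule DispatchSketch do
  # Hypothetical classifier for illustration; not DateTimeParser's API.
  def classify(string) when is_binary(string) do
    # Digits, optionally followed by a fractional part ("optional precision").
    case Regex.run(~r/\A(\d+)(?:\.\d+)?\z/, string) do
      [_, digits] -> by_digit_count(String.length(digits))
      # Not purely numeric, so fall through to the tokenizer.
      nil -> :tokenizer
    end
  end

  defp by_digit_count(n) when n in 1..5, do: :serial
  defp by_digit_count(n) when n in 10..11, do: :epoch
  # 6-9 digits (e.g. "20190429") and anything longer go to the tokenizer.
  defp by_digit_count(_), do: :tokenizer
end

DispatchSketch.classify("41261.6013888889")
# returns :serial
DispatchSketch.classify("20190429")
# returns :tokenizer
DispatchSketch.classify("1546300800")
# returns :epoch
```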
Planned Breaking Changes
- `parse_datetime` currently assumes `00:00:00` time if it cannot be determined. This will likely change in a future version because it's better to have no information than to have wrong information. If you want to assume `00:00:00`, that's fine, but this library shouldn't assume it for you, or should at least make it an option.
- `parse_datetime` currently defaults to converting to UTC when the timezone is known. This default may change to keep the original timezone information. This will help for future timestamps since timezone rules change; converting to UTC too early may use rules that become outdated by the time the timestamp arrives. The option to convert to UTC will remain, but may not be the default.
- Introduce `parse` to parse as much as it can, but return any of the structs `%DateTime{}`, `%NaiveDateTime{}`, `%Date{}`, or `%Time{}`. It would be up to you to match on what the result is and do what you will. If you know you want one specific struct, then you can continue to use the more specific functions like `parse_date`.
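Since `parse` is only planned, the sketch below does not call it. It shows one way a caller might match on whichever struct comes back, using values built with the standard sigils. `ResultSketch` and `describe/1` are hypothetical names.

```elixir
defmodule ResultSketch do
  # Hypothetical consumer for illustration; one clause per struct
  # that the planned `parse` might return.
  def describe(%DateTime{} = datetime), do: {:datetime, DateTime.to_iso8601(datetime)}
  def describe(%NaiveDateTime{} = naive), do: {:naive_datetime, NaiveDateTime.to_iso8601(naive)}
  def describe(%Date{} = date), do: {:date, Date.to_iso8601(date)}
  def describe(%Time{} = time), do: {:time, Time.to_iso8601(time)}
end

ResultSketch.describe(~D[2034-01-13])
# returns {:date, "2034-01-13"}
ResultSketch.describe(~N[2018-09-19 08:15:22])
# returns {:naive_datetime, "2018-09-19T08:15:22"}
```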
Required reading
- Elixir DateTime docs
- Elixir NaiveDateTime docs
- Elixir Date docs
- Elixir Time docs
- Elixir Calendar docs
- How to save datetimes for future events (when UTC is not the right answer)
  - tl;dr: rules change, so don't convert to UTC too early. The future might change the timezone conversion rules.
Documentation
Examples
```elixir
iex> DateTimeParser.parse_datetime("19 September 2018 08:15:22 AM")
{:ok, ~N[2018-09-19 08:15:22]}

iex> DateTimeParser.parse_datetime("2034-01-13")
{:ok, ~N[2034-01-13 00:00:00]}

iex> DateTimeParser.parse_date("2034-01-13")
{:ok, ~D[2034-01-13]}

iex> DateTimeParser.parse_date("01/01/2017")
{:ok, ~D[2017-01-01]}

iex> DateTimeParser.parse_datetime("1/1/18 3:24 PM")
{:ok, ~N[2018-01-01T15:24:00]}

iex> DateTimeParser.parse_datetime("1/1/18 3:24 PM", assume_utc: true)
{:ok, ~U[2018-01-01T15:24:00Z]}
# the ~U is a DateTime sigil introduced in Elixir 1.9.0

iex> DateTimeParser.parse_datetime(~s|"Dec 1, 2018 7:39:53 AM PST"|)
{:ok, ~U[2018-12-01T14:39:53Z]}
# Notice that the date is converted to UTC by default

iex> {:ok, datetime} = DateTimeParser.parse_datetime(~s|"Dec 1, 2018 7:39:53 AM PST"|, to_utc: false)
iex> datetime
#DateTime<2018-12-01 07:39:53-07:00 PDT PST8PDT>

iex> DateTimeParser.parse_time("10:13pm")
{:ok, ~T[22:13:00]}

iex> DateTimeParser.parse_time("10:13:34")
{:ok, ~T[10:13:34]}

iex> DateTimeParser.parse_datetime(nil)
{:error, "Could not parse nil"}
```
See more examples automatically generated by the tests
Installation
Add `date_time_parser` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:date_time_parser, "~> 0.2.0"}
  ]
end
```