Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Parse any date in Java

I know this question is asked quite a bit, and obviously you can't parse every arbitrary date. However, I find that the python-dateutil library parses every date I throw at it, with absolutely zero effort spent figuring out a date format string. Joda-Time is always sold as a great Java date parser, but it still requires you to decide what format your date is in before you pick a formatter (or create your own). You can't just call DateFormatter.parse(mydate) and magically get a Date object back.

For example, the date "Wed Mar 04 05:09:06 GMT-06:00 2009" is properly parsed with python-dateutil:

import dateutil.parser
print(dateutil.parser.parse('Wed Mar 04 05:09:06 GMT-06:00 2009'))

but the following Joda-Time call fails:

    String date = "Wed Mar 04 05:09:06 GMT-06:00 2009";
    DateTimeFormatter fmt = ISODateTimeFormat.dateTime();
    DateTime dt = fmt.parseDateTime(date);
    System.out.println(dt);

And creating your own DateTimeFormatter defeats the purpose, since that seems to be the same as using SimpleDateFormat with the correct format string.

Is there a comparable way to parse a date in Java, like python-dateutil? I don't care about occasional errors, I just want it to be mostly right.

1 Reply

Your best bet is really to enlist regex to match the date format pattern, and/or to brute-force it by trying candidate formats one after another.
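
To make the brute-force half of that concrete, here is a minimal sketch, not code from the DateUtil class mentioned in this answer. The class name BruteForceDateParser and the candidate pattern list are made up for illustration. It tries each pattern in turn with a strict SimpleDateFormat and only accepts a parse that consumes the whole input.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only.
final class BruteForceDateParser {

    // Assumed candidate patterns; order matters when several could match.
    private static final List<String> CANDIDATE_PATTERNS = Arrays.asList(
        "yyyy-MM-dd HH:mm:ss",
        "yyyy-MM-dd",
        "dd-MM-yyyy",
        "MM/dd/yyyy",
        "dd MMM yyyy"
    );

    /** Returns the first successful strict parse, or null if no pattern fits. */
    static Date parse(String dateString) {
        for (String pattern : CANDIDATE_PATTERNS) {
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
            format.setLenient(false); // reject impossible values such as day 2009
            ParsePosition position = new ParsePosition(0);
            Date parsed = format.parse(dateString, position);
            // Accept only if the pattern consumed the entire input.
            if (parsed != null && position.getIndex() == dateString.length()) {
                return parsed;
            }
        }
        return null; // Unknown format.
    }
}
```

Pure brute force gets slow and ambiguous as the list grows (dd-MM-yyyy versus MM-dd-yyyy cannot be told apart for many inputs), which is why a regex pre-check to narrow the candidates is worth having.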

Several years ago I wrote a silly little DateUtil class which did the job. Here's the relevant extract:

// Needs java.util.HashMap and java.util.Map. Keys are regexes matched against the
// lower-cased input; values are the corresponding SimpleDateFormat patterns.
private static final Map<String, String> DATE_FORMAT_REGEXPS = new HashMap<String, String>() {{
    put("^\\d{8}$", "yyyyMMdd");
    put("^\\d{1,2}-\\d{1,2}-\\d{4}$", "dd-MM-yyyy");
    put("^\\d{4}-\\d{1,2}-\\d{1,2}$", "yyyy-MM-dd");
    put("^\\d{1,2}/\\d{1,2}/\\d{4}$", "MM/dd/yyyy");
    put("^\\d{4}/\\d{1,2}/\\d{1,2}$", "yyyy/MM/dd");
    put("^\\d{1,2}\\s[a-z]{3}\\s\\d{4}$", "dd MMM yyyy");
    put("^\\d{1,2}\\s[a-z]{4,}\\s\\d{4}$", "dd MMMM yyyy");
    put("^\\d{12}$", "yyyyMMddHHmm");
    put("^\\d{8}\\s\\d{4}$", "yyyyMMdd HHmm");
    put("^\\d{1,2}-\\d{1,2}-\\d{4}\\s\\d{1,2}:\\d{2}$", "dd-MM-yyyy HH:mm");
    put("^\\d{4}-\\d{1,2}-\\d{1,2}\\s\\d{1,2}:\\d{2}$", "yyyy-MM-dd HH:mm");
    put("^\\d{1,2}/\\d{1,2}/\\d{4}\\s\\d{1,2}:\\d{2}$", "MM/dd/yyyy HH:mm");
    put("^\\d{4}/\\d{1,2}/\\d{1,2}\\s\\d{1,2}:\\d{2}$", "yyyy/MM/dd HH:mm");
    put("^\\d{1,2}\\s[a-z]{3}\\s\\d{4}\\s\\d{1,2}:\\d{2}$", "dd MMM yyyy HH:mm");
    put("^\\d{1,2}\\s[a-z]{4,}\\s\\d{4}\\s\\d{1,2}:\\d{2}$", "dd MMMM yyyy HH:mm");
    put("^\\d{14}$", "yyyyMMddHHmmss");
    put("^\\d{8}\\s\\d{6}$", "yyyyMMdd HHmmss");
    put("^\\d{1,2}-\\d{1,2}-\\d{4}\\s\\d{1,2}:\\d{2}:\\d{2}$", "dd-MM-yyyy HH:mm:ss");
    put("^\\d{4}-\\d{1,2}-\\d{1,2}\\s\\d{1,2}:\\d{2}:\\d{2}$", "yyyy-MM-dd HH:mm:ss");
    put("^\\d{1,2}/\\d{1,2}/\\d{4}\\s\\d{1,2}:\\d{2}:\\d{2}$", "MM/dd/yyyy HH:mm:ss");
    put("^\\d{4}/\\d{1,2}/\\d{1,2}\\s\\d{1,2}:\\d{2}:\\d{2}$", "yyyy/MM/dd HH:mm:ss");
    put("^\\d{1,2}\\s[a-z]{3}\\s\\d{4}\\s\\d{1,2}:\\d{2}:\\d{2}$", "dd MMM yyyy HH:mm:ss");
    put("^\\d{1,2}\\s[a-z]{4,}\\s\\d{4}\\s\\d{1,2}:\\d{2}:\\d{2}$", "dd MMMM yyyy HH:mm:ss");
}};

/**
 * Determine SimpleDateFormat pattern matching with the given date string. Returns null if
 * format is unknown. You can simply extend DateUtil with more formats if needed.
 * @param dateString The date string to determine the SimpleDateFormat pattern for.
 * @return The matching SimpleDateFormat pattern, or null if format is unknown.
 * @see SimpleDateFormat
 */
public static String determineDateFormat(String dateString) {
    // Lower-case once up front; the regexes only contain lower-case letters.
    String lowered = dateString.toLowerCase();
    for (Map.Entry<String, String> entry : DATE_FORMAT_REGEXPS.entrySet()) {
        if (lowered.matches(entry.getKey())) {
            return entry.getValue();
        }
    }
    return null; // Unknown format.
}

(cough, double brace initialization, cough, it was just to get it all to fit in 100 char max length ;) )

You can easily expand it yourself with new regexes and date format patterns.
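
As a rough illustration of both using and extending the lookup, here is a self-contained sketch. The class name DateUtilDemo, the parse helper, and the extra ISO-8601-style entry are assumptions added for this example and are not part of the original DateUtil class. Only a small subset of the table is repeated to keep it short.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical demo class for illustration only.
final class DateUtilDemo {

    // Subset of the regex table, plus one added entry.
    private static final Map<String, String> DATE_FORMAT_REGEXPS = new LinkedHashMap<>();
    static {
        DATE_FORMAT_REGEXPS.put("^\\d{4}-\\d{1,2}-\\d{1,2}$", "yyyy-MM-dd");
        DATE_FORMAT_REGEXPS.put("^\\d{1,2}\\s[a-z]{3}\\s\\d{4}\\s\\d{1,2}:\\d{2}$", "dd MMM yyyy HH:mm");
        // Added for illustration: the input is lower-cased before matching,
        // so the literal 'T' separator has to be written as 't' in the regex.
        DATE_FORMAT_REGEXPS.put("^\\d{4}-\\d{1,2}-\\d{1,2}t\\d{1,2}:\\d{2}:\\d{2}$",
                "yyyy-MM-dd'T'HH:mm:ss");
    }

    /** Returns the SimpleDateFormat pattern for the input, or null if unknown. */
    static String determineDateFormat(String dateString) {
        String lowered = dateString.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : DATE_FORMAT_REGEXPS.entrySet()) {
            if (lowered.matches(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null; // Unknown format.
    }

    /** Looks up the pattern, then parses strictly; returns null on any failure. */
    static Date parse(String dateString) {
        String pattern = determineDateFormat(dateString);
        if (pattern == null) {
            return null;
        }
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
        format.setLenient(false);
        return format.parse(dateString, new ParsePosition(0));
    }
}
```

With this subset, DateUtilDemo.determineDateFormat("2009-03-04T05:09:06") returns the added pattern, while the "Wed Mar 04 05:09:06 GMT-06:00 2009" string from the question returns null until a matching regex and pattern are added for it.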

