Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

oracle - convert many date formats to a single formatted date

I want to convert strings that contain dates in various formats into a single date format. For example:

  • 13-06-2012 to 13-JUN-12
  • 13/06/2012 to 13-JUN-12
  • 13-JUN-2012 to 13-JUN-12
  • 13/jun-2012 to 13-JUN-12
  • ...

I tried stripping all special characters and then passing the result to a function that converts the string into a date in a single format. My function keeps raising exceptions, and I don't know why.

The function:

CREATE OR REPLACE FUNCTION normalize_date (data_in IN VARCHAR2)
    RETURN DATE
IS
    tmp_month         VARCHAR2 (3);
    tmp_day           VARCHAR2 (2);
    tmp_year          VARCHAR2 (4);
    TMP_YEAR_NUMBER   NUMBER;
    result            DATE;
BEGIN
    tmp_day := SUBSTR (data_in, 1, 2);
    tmp_year := SUBSTR (data_in, -4);

    --if(REGEXP_LIKE(SUBSTR(data_in,3,2), '[:alpha:]')) then 
    IF (SUBSTR (data_in, 3, 1) IN ('a','j','i','f','m','s','o','n','d','A','J','I','F','M','S','O','N','D'))
    THEN
        tmp_month := UPPER (SUBSTR (data_in, 3, 3));
    ELSE
        tmp_month := SUBSTR (data_in, 3, 2);
    END IF;

    DBMS_OUTPUT.put_line (tmp_year);

    TMP_YEAR_NUMBER := TO_NUMBER (tmp_year);

    IF (tmp_month = 'JAN')
    THEN
        tmp_month := '01';
    END IF;

    IF (tmp_month = 'FEB')
    THEN
        tmp_month := '02';
    END IF;

    IF (tmp_month = 'MAR')
    THEN
        tmp_month := '03';
    END IF;

    IF (tmp_month = 'APR')
    THEN
        tmp_month := '04';
    END IF;

    IF (tmp_month = 'MAY')
    THEN
        tmp_month := '05';
    END IF;

    IF (tmp_month = 'JUN')
    THEN
        tmp_month := '06';
    END IF;

    IF (tmp_month = 'JUL')
    THEN
        tmp_month := '07';
    END IF;

    IF (tmp_month = 'AUG')
    THEN
        tmp_month := '08';
    END IF;

    IF (tmp_month = 'SEP')
    THEN
        tmp_month := '09';
    END IF;

    IF (tmp_month = 'OCT')
    THEN
        tmp_month := '10';
    END IF;

    IF (tmp_month = 'NOV')
    THEN
        tmp_month := '11';
    END IF;

    IF (tmp_month = 'DEC')
    THEN
        tmp_month := '12';
    END IF;

   -- dbms_output.put_line(tmp_day || '~'||tmp_year || '~' ||tmp_month);

    IF (LENGTH (tmp_day || tmp_year || tmp_month) <> 8)
    THEN
        result := TO_DATE ('31122999', 'DDMMYYYY');
        RETURN result;
    END IF;

 --   dbms_output.put_line('before end');
    result:=TO_DATE (tmp_day || tmp_month ||tmp_year , 'DDMMYYYY');
 --   dbms_output.put_line('date result: '|| result);
    RETURN result;
EXCEPTION
    WHEN NO_DATA_FOUND
    THEN
        NULL;
    WHEN OTHERS
    THEN
        result := TO_DATE ('3012299', 'DDMMYYYY');
        RETURN result;
        RAISE;
END normalize_date;

Usage

SELECT customer_no,
       str_data_expirare,
       normalize_date (str_data_expirare_trim) AS data_expirare_buletin
  FROM (SELECT customer_no,
               str_data_expirare,
               REGEXP_REPLACE (str_data_expirare, '[^a-zA-Z0-9]+', '')
                   AS str_data_expirare_trim
          FROM (SELECT Q1.set_act_id_1,
                       Q1.customer_no,
                       NVL (SUBSTR (set_act_id_1,
                                      INSTR (set_act_id_1,
                                             '+',
                                             1,
                                             5)
                                    + 1,
                                    LENGTH (set_act_id_1)),
                            'NULL')
                           AS str_data_expirare
                  FROM STAGE_CORE.IFLEX_CUSTOMERS Q1
                  WHERE Q1.set_act_id_1 IS NOT NULL
                  )
        );

1 Reply

If you have a sound idea of all the possible date formats it might be easier to use brute force:

create or replace function clean_date
    ( p_date_str in varchar2)
    return date
is
    l_dt_fmt_nt sys.dbms_debug_vc2coll := sys.dbms_debug_vc2coll
        ('DD-MON-YYYY', 'DD-MON-YY', 'DD-MM-YYYY', 'MM-DD-YYYY', 'YYYY-MM-DD'
         , 'DD/MM/YYYY', 'MM/DD/YYYY', 'YYYY/MM/DD', 'DD/MM/YY', 'MM/DD/YY');
    return_value date;
begin
    for idx in l_dt_fmt_nt.first()..l_dt_fmt_nt.last()
    loop
        begin
            return_value := to_date(p_date_str, l_dt_fmt_nt(idx));
            exit;
        exception
             when others then null;
        end;
    end loop;
    if return_value is null then
        raise no_data_found; 
    end if;
    return return_value;
exception
    when no_data_found then
        raise_application_error(-20000, p_date_str|| ' is unknown date format');
end clean_date;
/
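
As a usage sketch, the sample strings from the question can be run through clean_date and rendered in the requested layout with to_char. The inline sample rows and the explicit NLS_DATE_LANGUAGE argument are assumptions added for illustration. The query also assumes the session's date language is English, so that to_date recognises 'JUN'.

```sql
-- Sample inputs from the question, fed through clean_date (defined above).
with samples as (
    select '13-06-2012' as str from dual union all
    select '13/06/2012'        from dual union all
    select '13-JUN-2012'       from dual union all
    select '13/jun-2012'       from dual
)
select str
     , to_char(clean_date(str), 'DD-MON-RR', 'NLS_DATE_LANGUAGE=ENGLISH') as normalized
from samples;
```

Under those assumptions each row should come back as 13-JUN-12, because to_date treats punctuation and letter case leniently when matching a string against a mask.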

Be aware that modern versions of Oracle are quite forgiving with date conversion. This function handled dates in formats which aren't in the list, with some interesting consequences:

SQL> select  clean_date('20160817') from dual;

CLEAN_DAT
---------
17-AUG-16

SQL> select  clean_date('160817') from dual;

CLEAN_DAT
---------
16-AUG-17

SQL> 

Which demonstrates the limits of automated data cleansing in the face of lax data integrity rules. The wages of sin is corrupted data.
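
If that leniency is unwanted, one option is the FX ("format exact") modifier, which makes to_date insist on the punctuation and digit counts given in the mask. The following is only a sketch: the name clean_date_strict and the particular mask list are hypothetical, chosen for illustration, and would need adjusting to the layouts that really occur in the data.

```sql
create or replace function clean_date_strict
    ( p_date_str in varchar2)
    return date
is
    -- Hypothetical mask list. The FX prefix disables lenient matching, so
    -- separators and the number of digits must match the mask exactly.
    l_dt_fmt_nt sys.dbms_debug_vc2coll := sys.dbms_debug_vc2coll
        ('FXDD-MM-YYYY', 'FXDD/MM/YYYY', 'FXDD-MON-YYYY', 'FXYYYY-MM-DD');
begin
    for idx in 1 .. l_dt_fmt_nt.count
    loop
        begin
            -- Return on the first mask that converts without error.
            return to_date(p_date_str, l_dt_fmt_nt(idx));
        exception
            when others then null;  -- conversion failed, try the next mask
        end;
    end loop;
    -- No mask matched.
    raise_application_error(-20000, p_date_str || ' is unknown date format');
end clean_date_strict;
/
```

With these masks, strings such as '20160817' or '160817' should be rejected with the -20000 error rather than silently reinterpreted. The trade-off is that every accepted layout, separators included, has to be listed explicitly.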


@AlexPoole raises the matter of using the 'RR' format. This element of the date mask was introduced as a Y2K kludge. It's rather depressing that we're still discussing it almost two decades into the new Millennium.

Anyway, the issue is this. If we cast the string '161225' to a date, what century does it have? Well, 'yymmdd' will give 2016-12-25. Fair enough, but what about '991225'? How likely is it that the date we really want is 2099-12-25? This is where the 'RR' format comes into play. Basically it defaults the century: while the current year is in 2000-2049, numbers 00-49 default to 20 and 50-99 default to 19. This window was determined by the Y2K issue: in 2000 it was more likely that '98 referred to the recent past than the near future, and similar logic applied to '02. Hence the halfway point of 1950. Note this is a fixed point, not a sliding window. As we move further from the year 2000, that pivot point becomes less useful.
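
A small query makes the difference visible. It assumes it is run in a session whose current year falls between 2000 and 2049; the column aliases are illustrative only.

```sql
-- Same two-digit year, two different century rules.
select to_char(to_date('991225', 'yymmdd'), 'YYYY-MM-DD') as yy_result  -- YY: current century
     , to_char(to_date('991225', 'rrmmdd'), 'YYYY-MM-DD') as rr_result  -- RR: 50-99 map to previous century
from dual;
```

Under that assumption yy_result is 2099-12-25 and rr_result is 1999-12-25.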

Anyway, the key point is that 'RRRR' does not play nicely with other date formats: to_date('501212', 'rrrrmmdd') hurls ORA-01843: not a valid month. So, use 'RR' and test for it before using 'YYYY'. So my revised function (with some tidying up) looks like this:

create or replace function clean_date
    ( p_date_str in varchar2)
    return date
is
    l_dt_fmt_nt sys.dbms_debug_vc2coll := sys.dbms_debug_vc2coll
        ('DD-MM-RR', 'MM-DD-RR', 'RR-MM-DD', 'RR-DD-MM'
         , 'DD-MM-YYYY', 'MM-DD-YYYY', 'YYYY-MM-DD', 'YYYY-DD-MM');
    return_value date;
begin
    for idx in l_dt_fmt_nt.first()..l_dt_fmt_nt.last()
    loop
        begin
            return_value := to_date(p_date_str, l_dt_fmt_nt(idx));
            exit;
        exception
             when others then null;
        end;
    end loop;
    if return_value is null then
        raise no_data_found; 
    end if;
    return return_value;
exception
    when no_data_found then
        raise_application_error(-20000, p_date_str|| ' is unknown date format');
end clean_date;
/

The key point remains: there's a limit to how smart we can make this function when it comes to interpreting dates, so make sure you lead with the best fit. If you think most of your date strings are day-month-year, put that format first; you will still get some wrong casts, but fewer than if you lead with year-month-day.
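
To see why the order of the masks matters, consider a string that is valid under more than one of them. The sample value below is an assumption chosen for illustration:

```sql
-- '03-04-2012' converts without error under both masks, but to different dates.
select to_char(to_date('03-04-2012', 'DD-MM-YYYY'), 'YYYY-MM-DD') as day_first
     , to_char(to_date('03-04-2012', 'MM-DD-YYYY'), 'YYYY-MM-DD') as month_first
from dual;
```

Here day_first is 2012-04-03 and month_first is 2012-03-04. Neither conversion raises an error, so the function returns whichever interpretation appears first in the list.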

