Добро пожаловать в форум, Guest  >>   Войти | Регистрация | Поиск | Правила | В избранное | Подписаться
Все форумы / Oracle Новый топик    Ответить
Топик располагается на нескольких страницах: [1] 2   вперед  Ctrl      все
 Периоды, объединение  [new]
Калина
Member

Откуда: Moskau
Сообщений: 2649
Имеется табличка
CREATE TABLE TST3
(
  F0     VARCHAR2(10 BYTE),
  F1     VARCHAR2(10 BYTE),
  SDATE  DATE,
  EDATE  DATE
);
Insert into TST3
   (F0, F1, SDATE, EDATE)
 Values
   ('F0', 'F1', TO_DATE('03/01/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/02/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into TST3
   (F0, F1, SDATE, EDATE)
 Values
   ('F0', 'F1', TO_DATE('03/02/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/03/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into TST3
   (F0, F1, SDATE, EDATE)
 Values
   ('F0', 'F2', TO_DATE('03/03/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/04/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into TST3
   (F0, F1, SDATE, EDATE)
 Values
   ('F0', 'F1', TO_DATE('03/04/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/05/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into TST3
   (F0, F1, SDATE, EDATE)
 Values
   ('F0', 'F1', TO_DATE('03/05/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/08/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
Insert into TST3
   (F0, F1, SDATE, EDATE)
 Values
   ('F0', 'F1', TO_DATE('03/08/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), TO_DATE('03/09/2006 00:00:00', 'MM/DD/YYYY HH24:MI:SS'));
COMMIT;
Необходимо объединить периоды для которых комбинация f0,f1 неизменна
, тоесть

F0|F1|01.03.2006|03.03.2006
F0|F2|03.03.2006|04.03.2006
F0|F1|04.03.2006|09.03.2006
Понимаю, что вроде бы как аналитические ф-ции, но немогу понять, сделать такое окно .
Может кто подтолкнет в нужную сторону?
Исходная табличка - 60 млн


------------------------------
Not affilated with VAZ
16 мар 06, 13:30    [2455611]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Elic
Member

Откуда:
Сообщений: 30019
Калина
Необходимо объединить периоды для которых комбинация f0,f1 неизменна
select f0, f1, min(sdate) as sdate, max(edate) as edate
  from
  ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) as group_no
      from
      ( select f0, f1, sdate, edate,
            decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_of_group
          from tst3
      )
  )
  group by f0, f1, group_no
  order by 3
;

F0         F1         SDATE             EDATE
---------- ---------- ----------------- -----------------
F0         F1         01.03.06 00:00:00 03.03.06 00:00:00
F0         F2         03.03.06 00:00:00 04.03.06 00:00:00
F0         F1         04.03.06 00:00:00 09.03.06 00:00:00
Калина
Исходная табличка - 60 млн
IMHO, умрёт.
16 мар 06, 13:51    [2455756]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Q u a d r o
Member

Откуда: Canada
Сообщений: 1987
SQL> select f0, f1, min(sdate) sdate, max(edate) edate
  2    from (
  3      select f0, f1, sdate, edate, sum(window_start) over (order by sdate) window
  4        from (
  5          select f0, f1, sdate, edate,
  6              case
  7                when lag(f0, 1, f0) over (order by sdate) != f0
  8                  or lag(f1, 1, f1) over (order by sdate) != f1 then 1 else 0
  9              end window_start
 10            from tst3)
 11    ) group by f0, f1, window
 12      order by sdate;

F0         F1         SDATE     EDATE
---------- ---------- --------- ---------
F0         F1         01-MAR-06 03-MAR-06
F0         F2         03-MAR-06 04-MAR-06
F0         F1         04-MAR-06 09-MAR-06
16 мар 06, 13:56    [2455789]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Jannny
Member

Откуда: Спб
Сообщений: 6424
Калина
Исходная табличка - 60 млн

А расскажите потом, что она скажет на группировку? Мы в свое время не добились, чтобы группировка такого же объема проходила. ;(
16 мар 06, 14:03    [2455840]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Калина
Member

Откуда: Moskau
Сообщений: 2649
Q u a d r o
SQL> select f0, f1, min(sdate) sdate, max(edate) edate
  2    from (
  3      select f0, f1, sdate, edate, sum(window_start) over (order by sdate) window
  4        from (
  5          select f0, f1, sdate, edate,
  6              case
  7                when lag(f0, 1, f0) over (order by sdate) != f0
  8                  or lag(f1, 1, f1) over (order by sdate) != f1 then 1 else 0
  9              end window_start
 10            from tst3)
 11    ) group by f0, f1, window
 12      order by sdate;

F0         F1         SDATE     EDATE
---------- ---------- --------- ---------
F0         F1         01-MAR-06 03-MAR-06
F0         F2         03-MAR-06 04-MAR-06
F0         F1         04-MAR-06 09-MAR-06


еще и полей типа f1,f0 - дофига, 13 штук
16 мар 06, 14:10    [2455886]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Q u a d r o
Member

Откуда: Canada
Сообщений: 1987
...может быть что-нибудь в этом духе?...
( если у Вас может быть "разделитель" )

select f0, f1, min(sdate) sdate, max(edate) edate
  from (
    select f0, f1, sdate, edate, sum(window_start) over (order by sdate) window
      from (
        select f0, f1, sdate, edate,
            case
              when lag(f0||'/'||f1, 1, f0||'/'||f1) over (order by sdate) != f0||'/'||f1
              then 1 else 0
            end window_start
          from tst3)
  ) group by f0, f1, window
    order by sdate;
16 мар 06, 14:20    [2455961]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Elic
Member

Откуда:
Сообщений: 30019
Q u a d r o
  6              case
  7                when lag(f0, 1, f0) over (order by sdate) != f0
  8                  or lag(f1, 1, f1) over (order by sdate) != f1 then 1 else 0
  9              end window_start
Кстати, этот вариант не совсем "объединяет периоды", т.к. на пропусках даёт ложное объединение.
16 мар 06, 15:32    [2456366]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
SY
Member

Откуда: Middlebury, CT USA
Сообщений: 10100
Since Исходная табличка - 60 млн, I would write a pipelined table function:

SQL> create or replace
  2    type tst3_obj_type
  3      is object(
  4                f0    varchar2(10 byte),
  5                f1    varchar2(10 byte),
  6                sdate date,
  7                edate date
  8               );
  9  /

Type created.

SQL> create or replace
  2    type tst3_tbl_type
  3      is table of tst3_obj_type;
  4  /

Type created.

SQL> create or replace
  2    function combine
  3      return tst3_tbl_type
  4      pipelined
  5      is
  6          cursor cur
  7            is
  8              select  *
  9                from  tst3
 10                order by f0,
 11                         f1,
 12                         sdate;
 13          comb_tst3 tst3_obj_type := tst3_obj_type(null,null,null,null);
 14          curr_tst3 tst3_obj_type := tst3_obj_type(null,null,null,null);
 15      begin
 16          open cur;
 17          fetch cur
 18            into comb_tst3.f0,
 19                 comb_tst3.f1,
 20                 comb_tst3.sdate,
 21                 comb_tst3.edate;
 22          if cur%found
 23            then
 24              loop
 25                fetch cur
 26                  into curr_tst3.f0,
 27                       curr_tst3.f1,
 28                       curr_tst3.sdate,
 29                       curr_tst3.edate;
 30                exit when cur%notfound;
 31                if     comb_tst3.f0 = curr_tst3.f0
 32                   and
 33                       comb_tst3.f1 = curr_tst3.f1
 34                   and
 35                       comb_tst3.edate = curr_tst3.sdate
 36                  then
 37                    comb_tst3.edate := curr_tst3.edate;
 38                else
 39                  pipe row(comb_tst3);
 40                  comb_tst3 := curr_tst3;
 41                end if;
 42              end loop;
 43              pipe row(comb_tst3);
 44          end if;
 45          close cur;
 46          return;
 47  end combine;
 48  /

Function created.

SQL> select  *
  2    from  table(combine)
  3  /

F0         F1         SDATE     EDATE
---------- ---------- --------- ---------
F0         F1         01-MAR-06 03-MAR-06
F0         F1         04-MAR-06 09-MAR-06
F0         F2         03-MAR-06 04-MAR-06

SQL> 

In such case we deal with one ORDER BY versus windowing/partitioning + group by + order by. And if there is an index on F0,F1,SDATE,EDATE it might be not that bad.

SY.
16 мар 06, 17:20    [2456984]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
SY
Member

Откуда: Middlebury, CT USA
Сообщений: 10100
One more thing - I assume intervals do not cross, otherwise it would be a bit more complex logic.

SY.
16 мар 06, 17:24    [2457023]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Q u a d r o
Member

Откуда: Canada
Сообщений: 1987
SY
I would write a pipelined table function


how about "test it"?

let's see how it goes.
17 мар 06, 02:58    [2458213]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
SY
Member

Откуда: Middlebury, CT USA
Сообщений: 10100
Q u a d r o
how about "test it"?


Well, I provided test results against Калина's small sample. If you mean volume testing, provide a test case and I'll gladly do it. In any case, as I already noted, PL/SQL solution uses a single pass through a sorted table comparing to window sort, group by and order by.

SY.
17 мар 06, 04:20    [2458235]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Q u a d r o
Member

Откуда: Canada
Сообщений: 1987
tkprof the analytics

tkprof the pipelined function

compare them

pipelined mean "write a lot of code" - we need to justify this one.
17 мар 06, 04:40    [2458240]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
SY
Member

Откуда: Middlebury, CT USA
Сообщений: 10100
Q u a d r o
tkprof the analytics

tkprof the pipelined function

compare them


You do not need tkprof to compare

select f0, f1, min(sdate) as sdate, max(edate) as edate
  from
  ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) as group_no
      from
      ( select f0, f1, sdate, edate,
            decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_of_group
          from tst3
      )
  )
  group by f0, f1, group_no
  order by 3
;

and

            select  *
              from  tst3
              order by f0,
                       f1,
                       sdate;

And tkprof can not evaluate PL/SQL portion of table function.

Q u a d r o
pipelined mean "write a lot of code" - we need to justify this one.


If avoiding partitioninig+window sort and group by on 60,000,000 rows is not enough, then I do not know what justification you are looking for. And BTW, I do not see any complexity in that table function.

Anyway, below is my test:

SQL> truncate table tst3;

Table truncated.

Elapsed: 00:00:00.01
SQL> insert into tst3
  2  select 'f0','f1',trunc(sysdate) - rownum,trunc(sysdate) - rownum + 1 from dba_objects,dba_users
;

980463 rows created.

Elapsed: 00:01:21.02
SQL> commit;

Commit complete.

Elapsed: 00:00:00.00
SQL> select f0, f1, min(sdate) as sdate, max(edate) as edate
  2    from
  3    ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) 
as group_no
  4        from
  5        ( select f0, f1, sdate, edate,
  6              decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_
of_group
  7            from tst3
  8        )
  9    )
 10    group by f0, f1, group_no
 11    order by 3
 12  ;

F0         F1         SDATE     EDATE
---------- ---------- --------- ---------
f0         f1         22-OCT-79 16-MAR-06

Elapsed: 00:02:22.04
SQL> select  *
  2    from  table(combine)
  3  /

F0         F1         SDATE     EDATE
---------- ---------- --------- ---------
f0         f1         22-OCT-79 16-MAR-06

Elapsed: 00:01:41.09
SQL> select f0, f1, min(sdate) as sdate, max(edate) as edate
  2    from
  3    ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) 
as group_no
  4        from
  5        ( select f0, f1, sdate, edate,
  6              decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_
of_group
  7            from tst3
  8        )
  9    )
 10    group by f0, f1, group_no
 11    order by 3
 12  ;

F0         F1         SDATE     EDATE
---------- ---------- --------- ---------
f0         f1         22-OCT-79 16-MAR-06

Elapsed: 00:02:11.07
SQL> select  *
  2    from  table(combine)
  3  /

F0         F1         SDATE     EDATE
---------- ---------- --------- ---------
f0         f1         22-OCT-79 16-MAR-06

Elapsed: 00:01:40.01
SQL> 

SY.
17 мар 06, 05:53    [2458265]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Q u a d r o
Member

Откуда: Canada
Сообщений: 1987
SY
And tkprof can not evaluate PL/SQL portion of table function.


Yes it can??

Anyway, you have only one combination of f0 and f1 - not a fair comparsion

how about:

SQL> truncate table tst3;

Table truncated.

SQL> insert into tst3
  2   select mod(level, 1000),
  3     mod(level, 100),
  4     trunc(sysdate) - level, trunc(sysdate) - level + 1
  5    from dual
  6    connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> alter session set sql_trace=true;

Session altered.

SQL> select count(*) from
  2  (
  3  select f0, f1, min(sdate) as sdate, max(edate) as edate
  4    from
  5    ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) 
as group_no
  6        from
  7        ( select f0, f1, sdate, edate,
  8              decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_
of_group
  9            from tst3
 10        )
 11    )
 12    group by f0, f1, group_no
 13    order by 3
 14  );

  COUNT(*)
----------
   1000000

SQL> select count(*) from table(combine);

  COUNT(*)
----------
   1000000

select count(*) 
from
 table(combine)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     31.40      32.79          0          0          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     31.40      32.79          0          0          0           1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 61  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=1979 pr=0 pw=0 time=37253670 us)
1000000   COLLECTION ITERATOR PICKLER FETCH COMBINE (cr=1979 pr=0 pw=0 time=59936188 us)

select count(*) from
(
select f0, f1, min(sdate) as sdate, max(edate) as edate
  from
  ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) as group_no
      from
      ( select f0, f1, sdate, edate,
            decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_of_group
          from tst3
      )
  )
  group by f0, f1, group_no
  order by 3
)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      9.88       9.64          0       1912          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      9.88       9.65          0       1913          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=1912 pr=0 pw=0 time=9647574 us)
1000000   VIEW  (cr=1912 pr=0 pw=0 time=12059974 us)
1000000    HASH GROUP BY (cr=1912 pr=0 pw=0 time=10059970 us)
1000000     VIEW  (cr=1912 pr=0 pw=0 time=11792996 us)
1000000      WINDOW BUFFER (cr=1912 pr=0 pw=0 time=8792992 us)
1000000       VIEW  (cr=1912 pr=0 pw=0 time=7898566 us)
1000000        WINDOW SORT (cr=1912 pr=0 pw=0 time=4898560 us)
1000000         TABLE ACCESS FULL TST3 (cr=1912 pr=0 pw=0 time=1000087 us)


pipelined is tad slower

so "it depends"
17 мар 06, 06:17    [2458282]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
SkyNet77
Guest
But "group by" clause is not neccessary, we can use "over(partition by " on external query instead of "group by" and decrease sort operations
17 мар 06, 08:44    [2458412]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Q u a d r o
Member

Откуда: Canada
Сообщений: 1987
benchmark it.

group by doesn't imply "sort", look above - hash group by - for you see
17 мар 06, 09:51    [2458605]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Elic
Member

Откуда:
Сообщений: 30019
2 SY & Q u a d r o
Спасибо за сравнение разных алгоритмов.
Лично мне интуитивно гораздо больше нравится (исключительно с точки зрения потенции в производительности) вариант SY. Более того, в нём имеется запас для оптимизации путём уменьшения переключений контекста при помощи bulk-fetch (а в 10-ке и вообще просто for-select-loop).

______________________________________________________________________

2 SkyNet77
Ты путаешь агрегацию с сортировкой.
17 мар 06, 10:01    [2458645]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Калина
Member

Откуда: Moskau
Сообщений: 2649
Мне лично вариант с Pipelined кажется не очень , но и SY и Elic - люди серьезные, в понедельник сделаю тестовый прогон на 1- млн , сравню все варианты
------------------------------
Not affilated with VAZ
17 мар 06, 10:22    [2458750]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
SeaGate
Member

Откуда: Новосибирск
Сообщений: 1636
Вот, вроде еще не предлагали, для десятки. Работает медленно, зато красиво выглядит :))
Connected to Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 
Connected as digger


SQL> create table tmp (db date,de date);

Table created

SQL> insert into tmp (select sysdate,sysdate+1 from dual union all select sysdate+1,sysdate+2 from dual union all select sysdate+3,sysdate+4 from dual); 

3 rows inserted

SQL> commit;

Commit complete

SQL> select min(connect_by_root(t.db)) db,
  2  t.de
  3  from tmp t
  4  where connect_by_isleaf = 1
  5  connect by t.db = prior t.de
  6  group by t.de
  7  order by db;

DB          DE
----------- -----------
17.03.2006  19.03.2006
20.03.2006  21.03.2006
17 мар 06, 14:22    [2460347]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
SY
Member

Откуда: Middlebury, CT USA
Сообщений: 10100
Well, I tried Quadro's test on my laptop:

Dell Inspirion I1150
Pentium 4
CPU 2.8GHz
RAM 512MB
And I came up with opposite results:

SQL*Plus: Release 10.2.0.1.0 - Production on Fri Mar 17 09:30:47 2006

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> set timing on
SQL> connect scott/tiger
Connected.
SQL> drop table tst3;

Table dropped.

Elapsed: 00:00:01.99
SQL> CREATE TABLE TST3
  2  (
  3    F0     VARCHAR2(10 BYTE),
  4    F1     VARCHAR2(10 BYTE),
  5    SDATE  DATE,
  6    EDATE  DATE
  7  );

Table created.

Elapsed: 00:00:00.30
SQL> insert into tst3
  2   select mod(level, 1000),
  3     mod(level, 100),
  4     trunc(sysdate) - level, trunc(sysdate) - level + 1
  5    from dual
  6    connect by level <= 1000000;

1000000 rows created.

Elapsed: 00:00:09.97
SQL> commit;

Commit complete.

Elapsed: 00:00:00.02
SQL> connect / as sysdba
Connected.
SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1247900 bytes
Variable Size              75498852 bytes
Database Buffers           88080384 bytes
Redo Buffers                2945024 bytes
Database mounted.
Database opened.
SQL> connect scott/tiger
Connected.
SQL> alter session set sql_trace=true;

Session altered.

Elapsed: 00:00:00.05
SQL> select count(*) from (
  2  select f0, f1, min(sdate) as sdate, max(edate) as edate
  3    from
  4    ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) 
as group_no
  5        from
  6        ( select f0, f1, sdate, edate,
  7              decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_
of_group
  8            from tst3
  9        )
 10    )
 11    group by f0, f1, group_no
 12    order by 3
 13  )
 14  ;

  COUNT(*)
----------
   1000000

Elapsed: 00:01:25.70
SQL> connect / as sysdba
Connected.
SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1247900 bytes
Variable Size              75498852 bytes
Database Buffers           88080384 bytes
Redo Buffers                2945024 bytes
Database mounted.
Database opened.
SQL> connect scott/tiger
Connected.
SQL> alter session set sql_trace=true;

Session altered.

Elapsed: 00:00:00.03
SQL> select count(*) from (select * from table(combine));

  COUNT(*)
----------
   1000000

Elapsed: 00:00:58.87
SQL> connect scott/tiger
Connected.
SQL> drop table tst3;

Table dropped.

Elapsed: 00:00:01.96
SQL> CREATE TABLE TST3
  2  (
  3    F0     VARCHAR2(10 BYTE),
  4    F1     VARCHAR2(10 BYTE),
  5    SDATE  DATE,
  6    EDATE  DATE
  7  );

Table created.

Elapsed: 00:00:00.29
SQL> insert into tst3
  2   select mod(level, 1000),
  3     mod(level, 100),
  4     trunc(sysdate) - level, trunc(sysdate) - level + 1
  5    from dual
  6    connect by level <= 1000000;

1000000 rows created.

Elapsed: 00:00:08.63
SQL> commit;

Commit complete.

Elapsed: 00:00:00.02
SQL> exec dbms_stats.gather_table_stats('SCOTT','TST3');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.95
SQL> connect / as sysdba
Connected.
SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1247900 bytes
Variable Size              75498852 bytes
Database Buffers           88080384 bytes
Redo Buffers                2945024 bytes
Database mounted.
Database opened.
SQL> connect scott/tiger
Connected.
SQL> alter session set sql_trace=true;

Session altered.

Elapsed: 00:00:00.03
SQL> select count(*) from (
  2  select f0, f1, min(sdate) as sdate, max(edate) as edate
  3    from
  4    ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) 
as group_no
  5        from
  6        ( select f0, f1, sdate, edate,
  7              decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_
of_group
  8            from tst3
  9        )
 10    )
 11    group by f0, f1, group_no
 12    order by 3
 13  )
 14  ;

  COUNT(*)
----------
   1000000

Elapsed: 00:01:17.99
SQL> connect / as sysdba
Connected.
SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1247900 bytes
Variable Size              75498852 bytes
Database Buffers           88080384 bytes
Redo Buffers                2945024 bytes
Database mounted.
Database opened.
SQL> connect scott/tiger
Connected.
SQL> alter session set sql_trace=true;

Session altered.

Elapsed: 00:00:00.04
SQL> set timing on
SQL> select count(*) from (select * from table(combine));

  COUNT(*)
----------
   1000000

Elapsed: 00:00:57.36
SQL> connect scott/tiger
Connected.
SQL> 

As you can see, with or without stats, pipelined table function is faster. Below are TKPROF results:


 ********************************************************************************

select count(*) from (
select f0, f1, min(sdate) as sdate, max(edate) as edate
  from
  ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) as group_no
      from
      ( select f0, f1, sdate, edate,
            decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_of_group
          from tst3
      )
  )
  group by f0, f1, group_no
  order by 3
)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.02       0.06          2          2          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     16.53      81.90      32505       6552         95           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     16.55      81.97      32507       6554         95           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 54  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=6552 pr=32505 pw=28932 time=81908378 us)
1000000   VIEW  (cr=6552 pr=32505 pw=28932 time=78197087 us)
1000000    HASH GROUP BY (cr=6552 pr=32505 pw=28932 time=73197078 us)
1000000     VIEW  (cr=6552 pr=27332 pw=23759 time=70310715 us)
1000000      WINDOW BUFFER (cr=6552 pr=27332 pw=23759 time=66310704 us)
1000000       VIEW  (cr=6552 pr=22856 pw=19299 time=66602881 us)
1000000        WINDOW SORT (cr=6552 pr=22856 pw=19299 time=61602868 us)
1000000         TABLE ACCESS FULL TST3 (cr=6552 pr=3556 pw=0 time=8020228 us)




********************************************************************************


********************************************************************************

select count(*) 
from
 (select * from table(combine))


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.04       0.06          1        255          1           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     27.95      30.00          0          0          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     27.99      30.06          1        255          1           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 54  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=4033 pr=7811 pw=3974 time=55452110 us)
1000000   COLLECTION ITERATOR PICKLER FETCH COMBINE (cr=4033 pr=7811 pw=3974 time=52225624 us)

********************************************************************************

********************************************************************************

SELECT * 
FROM
 TST3 ORDER BY F0, F1, SDATE


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.03          1          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch   1000001      8.01      23.41       7351       3841         27     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   1000003      8.02      23.45       7352       3842         27     1000000

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 54     (recursive depth: 1)



********************************************************************************


********************************************************************************

select count(*) from (
select f0, f1, min(sdate) as sdate, max(edate) as edate
  from
  ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) as group_no
      from
      ( select f0, f1, sdate, edate,
            decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_of_group
          from tst3
      )
  )
  group by f0, f1, group_no
  order by 3
)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     16.67      77.30      36463       3841         94           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     16.68      77.31      36463       3841         94           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 54  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=3841 pr=36463 pw=32611 time=77304018 us)
1000000   VIEW  (cr=3841 pr=36463 pw=32611 time=76480138 us)
1000000    HASH GROUP BY (cr=3841 pr=36463 pw=32611 time=71480129 us)
1000000     VIEW  (cr=3841 pr=27223 pw=23371 time=67697064 us)
1000000      WINDOW BUFFER (cr=3841 pr=27223 pw=23371 time=62697054 us)
1000000       VIEW  (cr=3841 pr=22747 pw=18911 time=64215930 us)
1000000        WINDOW SORT (cr=3841 pr=22747 pw=18911 time=59215918 us)
1000000         TABLE ACCESS FULL TST3 (cr=3841 pr=3835 pw=0 time=21089208 us)




********************************************************************************

********************************************************************************

select count(*) 
from
 (select * from table(combine))


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.03       0.08          1        255          1           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     27.28      29.05          0          0          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     27.31      29.13          1        255          1           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 54  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=3967 pr=7906 pw=4066 time=36384922 us)
1000000   COLLECTION ITERATOR PICKLER FETCH COMBINE (cr=3967 pr=7906 pw=4066 time=46619098 us)

********************************************************************************

********************************************************************************

SELECT * 
FROM
 TST3 ORDER BY F0, F1, SDATE


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch   1000001      8.23      24.23       7901       3841         34     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   1000003      8.23      24.24       7901       3841         34     1000000

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 54     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
1000000  SORT ORDER BY (cr=3841 pr=7901 pw=4066 time=11519767 us)
1000000   TABLE ACCESS FULL TST3 (cr=3841 pr=3835 pw=0 time=4037466 us)




********************************************************************************
17 мар 06, 18:21    [2461895]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Q u a d r o
Member

Откуда: Canada
Сообщений: 1987
Well...

you have a little low workarea size :-)

see - here is the problem:

SELECT * 
FROM
 TST3 ORDER BY F0, F1, SDATE


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch   1000001      8.23      24.23       7901       3841         34     1000000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   1000003      8.23      24.24       7901       3841         34     1000000

7901 reads from disk

select count(*) from (
select f0, f1, min(sdate) as sdate, max(edate) as edate
  from
  ( select f0, f1, sdate, edate, sum(start_of_group) over (partition by f0, f1 order by sdate) as group_no
      from
      ( select f0, f1, sdate, edate,
            decode(lag(edate) over (partition by f0, f1 order by sdate), sdate, 0, 1) as start_of_group
          from tst3
      )
  )
  group by f0, f1, group_no
  order by 3
)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     16.67      77.30      36463       3841         94           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     16.68      77.31      36463       3841         94           1

36463 reads from disk - you are swapping to temp like mad in this test

see above - in my tests the disk was zero.

analytics grabs more memory for hashing/sorting - and you don't have enough.

Definitley it is not for laptops with 512 megs of ram :-)
19 мар 06, 03:18    [2464012]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
SY
Member

Откуда: Middlebury, CT USA
Сообщений: 10100
Q u a d r o
36463 reads from disk - you are swapping to temp like mad in this test.
Definitley it is not for laptops with 512 megs of ram :-)


Absolutely. And it is, IMHO, good since if reflects (don't know how accurate though) scalability issue. In this case table has 1,000,000 rows while real one will have 60 times more. So even if just 1GB of memory takes care of 1,000,000 rows 60 million rows would need a bit more than that :).

Q u a d r o
see above - in my tests the disk was zero.


Which tells me your test was not 100% cosher - you should have at least as many "disk" as it takes to read the table itself. Most likely you executed SQL more than once, so table data was already cached.

SY.
19 мар 06, 16:43    [2464514]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
Q u a d r o
Member

Откуда: Canada
Сообщений: 1987
SY
good since if reflects (don't know how accurate though) scalability issue


I think one would better say: analytics is a tradeoff - they trade memory for speed

SY
you should have at least as many "disk" as it takes to read the table itself


no i should not - given i had buffer cache large enought

you do realize that

SQL> insert into tst3
  2   select mod(level, 1000),
  3    mod(level, 100),
  4    trunc(sysdate) - level, trunc(sysdate) - level + 1
  5   from dual
  6   connect by level <= 1000000;

is a conventional insert? It will already cache blocks for us.

consider:

SQL> truncate table tst3;

Table truncated.

SQL> insert into tst3
  2   select mod(level, 1000),
  3    mod(level, 100),
  4    trunc(sysdate) - level, trunc(sysdate) - level + 1
  5   from dual
  6   connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> alter session set sql_trace=true;

Session altered.

SQL> select count(*) from tst3;

  COUNT(*)
----------
   1000000

and the tkprof

select count(*) 
from
 tst3


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.14       0.14          0       1912          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.15       0.14          0       1913          0           1


zero PIO - for you see. Data was already cached for us by insert.


btw - in 10GR2 you don't need to bounce as there is alter system flush buffer_cache.
20 мар 06, 02:40    [2465406]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
SY
Member

Откуда: Middlebury, CT USA
Сообщений: 10100
Q u a d r o
no i should not - given i had buffer cache large enought

you do realize that
It will already cache blocks for us.


I do realize it. And that was exactly what I talking about. In real test there will not be a preceding insert, select or any other case that caches table. And taking into consideration 60,000,000 rows and multiuser system I would not count on 0 "disk".

SY.
20 мар 06, 17:18    [2468465]     Ответить | Цитировать Сообщить модератору
 Re: Периоды, объединение  [new]
4uko
Member

Откуда:
Сообщений: 276
Jannny
Калина
Исходная табличка - 60 млн

А расскажите потом, что она скажет на группировку? Мы в свое время не добились, чтобы группировка такого же объема проходила. ;(


Мне тоже интересно. Кто нибудь ответит ?
9 окт 06, 16:51    [3238194]     Ответить | Цитировать Сообщить модератору
Топик располагается на нескольких страницах: [1] 2   вперед  Ctrl      все
Все форумы / Oracle Ответить