Generate 1 million rows and insert them into a simple table

Description:

I am trying to insert 1 million rows into an empty table in MSSQL 2012 Express. Here is my script:

-- set statistics time off
drop table t1
create table t1 (id int, a text, b text) 
go

-- #1 - 1,000,000 - 30s -> 45s
with ID(number) as
(
    select 1 as number
    union all
    select number + 1
    from ID
    where number < 1000000 + 1
)
insert into t1
    select number, 'a_' + cast (number as varchar), 'b_' + cast (number/2 as varchar)
    from ID  
    option(maxrecursion 0)


-- #2 - 1 million rows => ~140,000 rows = 120s (have to cancel query)
declare @count int
set @count = 0
while @count < 1000000
begin
    set @count = @count + 1
    insert into t1 
        values(@count, 'a_' + cast (@count as varchar), 'b_' + cast (@count/2 as varchar))
end

-- #3 - ~1,300,000 rows - 18s -> 20s  

with temp as 
(
    SELECT  ROW_NUMBER() OVER(ORDER BY a.object_id) as tcount 
    from sys.all_columns a,  sys.all_columns b
    where a.object_id = b.object_id  
) 
insert into t1
    select tcount, 'a_' + cast (tcount as varchar), 'b_' + cast (tcount/2 as varchar) 
    from temp 
go

declare @count int
set @count = 0
while @count < 3
begin
    with temp as (select max(id) + 1 as max_id from t1)
    insert into t1
        select max_id, 'a_' + cast (max_id as varchar), 'b_' + cast (max_id/2 as varchar) 
        from t1, temp 
    set @count = @count + 1
end

-- #4 -- 1,000,000 = 3s -> 4s (have to drop t1 first)
with a(k) as
(
select 1 as k
union all
select k + 1 from a where k < 99 + 1
) , 
t2 as (
select row_number() over(order by x.k) as k
from a x , a y , a z 
) 
select k as id , 'a_' + cast (k as varchar) as a, 'b_' + cast (k/2 as varchar) as b into t1
from t2

Question:

After some research I found these 4 solutions. Is there a better one (that does not rely on copying data from files)?

Luan Huynh

Itzik Ben-Gan uses the following approach. It is probably the fastest way he has found, and he is quite clever :-)

WITH
  L0   AS (SELECT c FROM (SELECT 1 UNION ALL SELECT 1) AS D(c)), -- 2^1
  L1   AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),       -- 2^2
  L2   AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),       -- 2^4
  L3   AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),       -- 2^8
  L4   AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),       -- 2^16
  L5   AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),       -- 2^32
  Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS k FROM L5)

select k as id , 'a_' + cast (k as varchar) as a, 'b_' + cast (k/2 as varchar) as b into t1
from nums
where k <= 1000000
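
To see why five levels of self cross joins are more than enough, it may help to count the rows at the first few levels. The following is a minimal sketch that reuses the CTE names from the query above; the column aliases rows_L0 to rows_L3 are made up for illustration.

```sql
-- Each level cross joins the previous level with itself, so the row count squares.
WITH
  L0 AS (SELECT c FROM (SELECT 1 UNION ALL SELECT 1) AS D(c)),
  L1 AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2 AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3 AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B)
SELECT
  (SELECT COUNT(*) FROM L0) AS rows_L0,  -- 2
  (SELECT COUNT(*) FROM L1) AS rows_L1,  -- 4
  (SELECT COUNT(*) FROM L2) AS rows_L2,  -- 16
  (SELECT COUNT(*) FROM L3) AS rows_L3;  -- 256
```

Continuing the pattern, L4 would have 65,536 rows and L5 4,294,967,296, so the filter k <= 1000000 is what limits the result to one million rows.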
dnoeth

A variation on dnoeth's answer:

WITH Ten(N) AS 
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)   
SELECT
    id = IDENTITY(int, 1, 1)
INTO dbo.T1
FROM Ten T10
CROSS JOIN Ten T100
CROSS JOIN Ten T1000
CROSS JOIN Ten T10000
CROSS JOIN Ten T100000
CROSS JOIN Ten T1000000;

ALTER TABLE dbo.T1
ADD a AS CONVERT(varchar(11), id);

ALTER TABLE dbo.T1
ADD b AS CONVERT(varchar(11), id / 2);

This avoids storing the values of a and b at all; they are computed at runtime as needed. This may be cheating slightly, but it has advantages:

  • No storage space is needed for columns a and b.
  • The id column is typed directly as integer (4 bytes uncompressed), whereas ROW_NUMBER returns bigint (8 bytes uncompressed).
  • The id column is given the identity property, so it cannot be updated.
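
These points can be checked against the catalog views once dbo.T1 from the script above exists. This is only a sketch: sys.columns, sys.types and sp_spaceused are standard SQL Server objects, but the sizes reported will depend on the instance.

```sql
-- Show each column's data type and whether it is an identity or computed column.
SELECT
    c.name        AS column_name,
    t.name        AS type_name,
    c.max_length,
    c.is_identity,
    c.is_computed
FROM sys.columns AS c
JOIN sys.types AS t
    ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(N'dbo.T1')
ORDER BY c.column_id;

-- Report the space used by the table; the computed columns a and b are not persisted.
EXEC sys.sp_spaceused @objname = N'dbo.T1';
```

The first result set should list id as int with is_identity = 1, and a and b with is_computed = 1.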

Alternatively, you can store all of the columns in the table:

WITH Ten(N) AS 
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)   
SELECT
    id = CONVERT(integer, ROW_NUMBER() OVER (ORDER BY T10.N)),
    a = CONVERT(varchar(11), ROW_NUMBER() OVER (ORDER BY T10.N)),
    b = CONVERT(varchar(11), ROW_NUMBER() OVER (ORDER BY T10.N) / 2)
INTO dbo.T1
FROM Ten T10
CROSS JOIN Ten T100
CROSS JOIN Ten T1000
CROSS JOIN Ten T10000
CROSS JOIN Ten T100000
CROSS JOIN Ten T1000000;

Note the conversion to integer for the id column and the explicit length on the varchar types. See:

"Bad Habits to Kick: Declaring VARCHAR without (length)" by Aaron Bertrand
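
As a short illustration of why the explicit length matters: in SQL Server, varchar without a length means varchar(1) in a variable or column declaration, but varchar(30) inside CAST and CONVERT. The variable name @v and the column aliases below are only examples.

```sql
-- In a declaration, varchar without a length is varchar(1): the value is silently truncated.
DECLARE @v varchar = 'abcdef';
SELECT @v AS declared_without_length;   -- 'a'

-- In CAST/CONVERT, varchar without a length is varchar(30).
SELECT LEN(CAST(REPLICATE('x', 40) AS varchar)) AS cast_without_length;   -- 30

-- An explicit length states the intent; 11 characters hold any int, including the minus sign.
SELECT CONVERT(varchar(11), CAST(-2147483648 AS int)) AS explicit_length;
```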

Paul White

Method 1: @dnoeth's answer above, insert time: 1077 ms - 1180 ms (tested 10 times)

Method 2: I tried inserting with the following method; insert time: 989 ms -> 1132 ms.
It is simple.

select t1.k as id , 'a_' + cast (t1.k as varchar) as a, 'b_' + cast (t1.k/2 as varchar) as b  into t1
from ( 
SELECT  ROW_NUMBER() OVER(ORDER BY a.object_id) as k 
from sys.all_columns, sys.all_columns a ) t1
where t1.k < 1000001

Method 3: Following Paul White's idea, 450 ms

with x1 as (select top 1000 object_id from sys.all_columns )
SELECT  id = IDENTITY(int, 1, 1) into t1
from x1 a, x1 b
ALTER TABLE dbo.T1 ADD a AS 'a_' + CONVERT(varchar(20),  id);
ALTER TABLE dbo.T1 ADD b AS  'b_' + CONVERT(varchar(20),  id / 2);
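
The post does not say how these timings were taken. The script in the question hints at SET STATISTICS TIME; another option, sketched below, is to measure elapsed wall-clock time with SYSDATETIME() around the statement under test. The variable @start and the alias elapsed_ms are illustrative names, and the sketch assumes the target table is dbo.T1.

```sql
-- Start from a clean state so SELECT ... INTO can create the table.
IF OBJECT_ID(N'dbo.T1', N'U') IS NOT NULL
    DROP TABLE dbo.T1;

DECLARE @start datetime2(3) = SYSDATETIME();

-- Statement under test: Method 2 from above, with explicit varchar lengths.
SELECT t1.k AS id,
       'a_' + CAST(t1.k AS varchar(20)) AS a,
       'b_' + CAST(t1.k / 2 AS varchar(20)) AS b
INTO dbo.T1
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY a.object_id) AS k
    FROM sys.all_columns, sys.all_columns AS a
) AS t1
WHERE t1.k < 1000001;

-- Elapsed time in milliseconds for the statement above.
SELECT DATEDIFF(millisecond, @start, SYSDATETIME()) AS elapsed_ms;
```

Running the batch several times and comparing the range of elapsed_ms values gives figures comparable to the ones quoted above.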
Luan Huynh

Another variation on dnoeth's answer:

WITH
L0   AS (SELECT c FROM (SELECT 1 UNION ALL SELECT 1 UNION ALL 
                        SELECT 1 UNION ALL SELECT 1 UNION ALL 
                        SELECT 1 UNION ALL SELECT 1) AS D(c)), -- 6^1
L1   AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),       -- 6^2
L2   AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),       -- 6^4
L3   AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),       -- 6^8
Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS k FROM L3)

select k as id , 'a_' + cast (k as varchar) as a, 'b_' + cast (k/2 as varchar) as b into t1
from nums
where k <= 1000000

Using 6 as the base is much more efficient, because 6^8 (1,679,616) is much closer to 1,000,000 than 2^32 (4,294,967,296).
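
The arithmetic behind this choice can be checked directly. The sketch below lists, for a few candidate base sizes, how many rows three squaring levels (L1 to L3) produce, that is base^8; the alias names are illustrative.

```sql
-- Rows available after three squaring levels for several base sizes.
SELECT b.base_rows,
       POWER(CAST(b.base_rows AS bigint), 8) AS rows_at_L3
FROM (VALUES (2), (4), (5), (6), (10)) AS b(base_rows)
ORDER BY b.base_rows;
-- 5 base rows give 390,625 (too few); 6 give 1,679,616,
-- the smallest base that reaches 1,000,000 with three levels.

-- For comparison: the base-2 cascade with five squaring levels.
SELECT POWER(CAST(2 AS bigint), 32) AS rows_base2_L5;   -- 4,294,967,296
```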

matgul