ASE使用with ignore_dup_row删除重复数据
之前博文中介绍了ASE中使用union来删除重复数据的方法:Sybase去除重复数据的一种方法!用union的方法要求先把不重复的数据行插入到临时表中,然后清空源表后再把数据倒回到源表中。
本文介绍使用索引的with ignore_dup_row选项来在线删除表的重复数据行。
创建非唯一、聚簇索引,带选项 : with ignore_dup_row
在索引的创建语法中,unique 和 with ignore_dup_row 是互斥的,nonclustered 和 with ignore_dup_row 是互斥的;也就是说,使用with ignore_dup_row选项时索引必须是非唯一、聚簇的。
通过索引删除重复数据后,再把新建的非唯一、聚簇索引删除。
下面演示创建过程及注意问题:
--创建表
create table DeleteDuplicateRow(id int
, name varchar(30)
, begin_time datetime
)
go
create index idx_DeleteDuplicateRow on DeleteDuplicateRow(id , name)
go--插入测试数据
insert into DeleteDuplicateRow values(1,'www.dbainfo.net','2014-5-5 14:30')
go
insert into DeleteDuplicateRow values(2,'Sybase数据库技术,数据库恢复','2014-8-17 1:20')
go
insert into DeleteDuplicateRow values(3,'分享Sybase数据库知识','2014-9-11 8:54')
go1> select * from DeleteDuplicateRow
2> go
id name begin_time
----------- ------------------------------ -------------------------------
1 www.dbainfo.net May 5 2014 2:30PM
2 Sybase数据库技术,数据庠 Aug 17 2014 1:20AM
3 分享Sybase数据库知识 Sep 11 2014 8:54AM(3 rows affected)
--从自身循环插入2次
1> insert into DeleteDuplicateRow
2> select * from DeleteDuplicateRow
3> go 2--总行数:12
1> select count(*) from DeleteDuplicateRow
2> go--因为ASE要求一张表上最多只能有一个聚簇索引,所以先检查表上是否存在聚簇索引:
1> sp_helpindex DeleteDuplicateRow
2> go
Object has the following indexes
index_name index_keys index_description index_max_rows_per_page index_fillfactor index_reservepagegap index_created index_local
---------------------- ---------- ----------------- ----------------------- ---------------- -------------------- ------------------- ------------
idx_DeleteDuplicateRow id, name nonclustered 0 0 0 Dec 9 2014 11:10PM Global Index(1 row affected)
index_ptn_name index_ptn_seg
-------------------------------- -------------
idx_DeleteDuplicateRow_766902983 default(1 row affected)
--发现表上只有一个非唯一、非聚簇索引:idx_DeleteDuplicateRow,可以不管这个索引。--如果使用with ignore_dup_row时不加clustered会报错:
1> create index clu_DeleteDuplicateRow on DeleteDuplicateRow(id , name , begin_time) with ignore_dup_row
2> go
Msg 1916, Level 16, State 3:
Server 'CENTOS5', Line 1:
CREATE INDEX options nonclustered and ignore_dup_row are mutually exclusive.--创建表DeleteDuplicateRow上所有列的非唯一、聚簇索引:clu_DeleteDuplicateRow
1> create clustered index clu_DeleteDuplicateRow on DeleteDuplicateRow(id , name , begin_time) with ignore_dup_row
2> go
Warning: deleted duplicate row. Primary key is '1, "www.dbainfo.net", "May 5 2014 2:30:00:000PM"'
Warning: deleted duplicate row. Primary key is '1, "www.dbainfo.net", "May 5 2014 2:30:00:000PM"'
Warning: deleted duplicate row. Primary key is '1, "www.dbainfo.net", "May 5 2014 2:30:00:000PM"'
Warning: deleted duplicate row. Primary key is '2, "Sybase数据库技术,数据庢, "Aug 17 2014 1:20:00:000AM"'
Warning: deleted duplicate row. Primary key is '2, "Sybase数据库技术,数据庢, "Aug 17 2014 1:20:00:000AM"'
Warning: deleted duplicate row. Primary key is '2, "Sybase数据库技术,数据庢, "Aug 17 2014 1:20:00:000AM"'
Warning: deleted duplicate row. Primary key is '3, "分享Sybase数据库知识", "Sep 11 2014 8:54:00:000AM"'
Warning: deleted duplicate row. Primary key is '3, "分享Sybase数据库知识", "Sep 11 2014 8:54:00:000AM"'
Warning: deleted duplicate row. Primary key is '3, "分享Sybase数据库知识", "Sep 11 2014 8:54:00:000AM"'
Non-clustered index (index id = 2) is being rebuilt.--查看表DeleteDuplicateRow上的索引情况,包含:非唯一、非聚簇索引,非唯一、聚簇索引
1> sp_helpindex DeleteDuplicateRow
2> go
Object has the following indexes
index_name index_keys index_description index_max_rows_per_page index_fillfactor index_reservepagegap index_created index_local
---------------------- --------------------- -------------------------------- ----------------------- ---------------- -------------------- ------------------- ------------
clu_DeleteDuplicateRow id, name, begin_time clustered, ignore duplicate rows 0 0 0 Dec 9 2014 11:46PM Global Index
idx_DeleteDuplicateRow id, name nonclustered 0 0 0 Dec 9 2014 11:10PM Global Index(2 rows affected)
index_ptn_name index_ptn_seg
--------------------------------- -------------
clu_DeleteDuplicateRow_1230904636 default
idx_DeleteDuplicateRow_1230904636 default(2 rows affected)
(return status = 0)--删除刚才创建的非唯一、聚簇索引:clu_DeleteDuplicateRow
1>drop index DeleteDuplicateRow.clu_DeleteDuplicateRow
2>go--查看表DeleteDuplicateRow上的索引情况
1> sp_helpindex DeleteDuplicateRow
2> go
Object has the following indexes
index_name index_keys index_description index_max_rows_per_page index_fillfactor index_reservepagegap index_created index_local
---------------------- ---------- ----------------- ----------------------- ---------------- -------------------- ------------------- ------------
idx_DeleteDuplicateRow id, name nonclustered 0 0 0 Dec 9 2014 11:10PM Global Index(1 row affected)
index_ptn_name index_ptn_seg
--------------------------------- -------------
idx_DeleteDuplicateRow_1230904636 default(1 row affected)
(return status = 0)--最后验证一下删除重复行之后的数据
1> select * from DeleteDuplicateRow
2> go
id name begin_time
----------- ------------------------------ -------------------------------
1 www.dbainfo.net May 5 2014 2:30PM
2 Sybase数据库技术,数据庠 Aug 17 2014 1:20AM
3 分享Sybase数据库知识 Sep 11 2014 8:54AM(3 rows affected)
--EOF--
照着走了一遍。