How can I do ‘insert if not exists’ in MySQL?

Question 1

I started by googling and found the article How to write INSERT if NOT EXISTS queries in standard SQL which talks about mutex tables.

I have a table with ~14 million records. If I want to add more data in the same format, is there a way to ensure the record I want to insert does not already exist without using a pair of queries (i.e., one query to check and one to insert if the result set is empty)?

Does a unique constraint on a field guarantee the insert will fail if it’s already there?

It seems that with merely a constraint, when I issue the insert via PHP, the script croaks.

Question 2

Use INSERT IGNORE INTO table.

There’s also the INSERT … ON DUPLICATE KEY UPDATE syntax; it is explained in the MySQL manual, section 13.2.6.2 INSERT … ON DUPLICATE KEY UPDATE Statement.

Post from bogdan.org.ua according to Google’s webcache:

18th October 2007

To start: as of the latest MySQL, syntax presented in the title is not
possible. But there are several very easy ways to accomplish what is
expected using existing functionality.

There are 3 possible solutions: using INSERT IGNORE, REPLACE, or
INSERT … ON DUPLICATE KEY UPDATE.

Imagine we have a table:
CREATE TABLE `transcripts` (
`ensembl_transcript_id` varchar(20) NOT NULL,
`transcript_chrom_start` int(10) unsigned NOT NULL,
`transcript_chrom_end` int(10) unsigned NOT NULL,
PRIMARY KEY (`ensembl_transcript_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Now imagine that we have an automatic pipeline importing transcripts
meta-data from Ensembl, and that due to various reasons the pipeline
might be broken at any step of execution. Thus, we need to ensure two
things:

repeated executions of the pipeline will not destroy our database

repeated executions will not die due to ‘duplicate primary key’ errors.

Method 1: using REPLACE

It’s very simple:
REPLACE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;
If the record exists, it will be overwritten; if it does not yet
exist, it will be created. However, using this method isn’t efficient
for our case: we do not need to overwrite existing records, it’s fine
just to skip them.
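
To make the overwrite visible, here is a minimal sketch that assumes the
`transcripts` table above and assumes a row with this id already exists.
The changed end coordinate (13000) is an illustrative value only. The
MySQL manual describes the affected-rows count of REPLACE as rows deleted
plus rows inserted, so checking it right after the statement shows
whether an old row was thrown away.

```sql
-- Assumption: 'ENSORGT00000000001' is already present in `transcripts`.
REPLACE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
    `transcript_chrom_start` = 12345,
    `transcript_chrom_end` = 13000;  -- hypothetical new value

-- Affected rows for the REPLACE above: deleted rows plus inserted rows.
SELECT ROW_COUNT();
```

A count greater than 1 means an existing row was deleted and re-inserted,
which is extra work when skipping the duplicate would have been enough.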

Method 2: using INSERT IGNORE

Also very simple:
INSERT IGNORE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;
Here, if the ‘ensembl_transcript_id’ is already present in the
database, it will be silently skipped (ignored). (To be more precise,
here’s a quote from MySQL reference manual: “If you use the IGNORE
keyword, errors that occur while executing the INSERT statement are
treated as warnings instead. For example, without IGNORE, a row that
duplicates an existing UNIQUE index or PRIMARY KEY value in the table
causes a duplicate-key error and the statement is aborted.”.) If the
record doesn’t yet exist, it will be created.

This second method has several potential weaknesses, including
non-abortion of the query in case any other problem occurs (see the
manual). Thus it should be used only if the query was previously tested
without the IGNORE keyword.
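
Because IGNORE downgrades errors to warnings, one way to see what was
suppressed is to inspect the warnings right after the statement. This is
a minimal sketch, assuming the `transcripts` table above; SHOW WARNINGS
must be the very next statement, since later statements reset the list.

```sql
INSERT IGNORE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
    `transcript_chrom_start` = 12345,
    `transcript_chrom_end` = 12678;

-- Lists anything IGNORE turned from an error into a warning,
-- a skipped duplicate key as well as any other downgraded problem.
SHOW WARNINGS;
```

An empty result means the row went in cleanly; any listed warning other
than a duplicate-key one deserves a closer look.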

Method 3: using INSERT … ON DUPLICATE KEY UPDATE:

The third option is to use the INSERT … ON DUPLICATE KEY UPDATE
syntax, and in the UPDATE part just do some meaningless (empty)
operation, like calculating 0+0 (Geoffray suggests doing the id=id
assignment so that the MySQL optimization engine ignores this
operation). The advantage of this method is that it only ignores
duplicate key events, and still aborts on other errors.
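
Applied to the `transcripts` table from above, this could look like the
following sketch. Here the "id=id" idea is written as assigning the
primary key column to itself; that choice of column is an assumption
made for illustration, and any no-op assignment would do.

```sql
INSERT INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
    `transcript_chrom_start` = 12345,
    `transcript_chrom_end` = 12678
-- No-op update: on a duplicate key the existing row is left unchanged.
ON DUPLICATE KEY UPDATE
    `ensembl_transcript_id` = `ensembl_transcript_id`;
```

Unlike INSERT IGNORE, other failures in this statement are still reported
as errors.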

As a final note: this post was inspired by Xaprb. I’d also advise
consulting his other post on writing flexible SQL queries.
