When inserting records into an SQL database using a existing table as a  source usually means we need to know and specify the column structure  of the target table. If we are trying to do this using a stored  procedure then our script must be changed every-time the table's schema  changes.

Here is a way to simplify this job of duplicating records in a Postgres database.

The Use Case

Suppose you have a database with 2 tables.

  • A library table containing details of different book libraries
  • A book table containing references to actual books

So we know all the books in a library and which library has each book.

Now suppose we decide to add a new library and populate it with the  same books in an existing library. We could iterate through a select on  the books

SELECT * FROM books WHERE id=oldid;

where oldid is the id of the existing library.

And for each record, we want to add a new record using newid as the id of the new library. This is quite awkward; all we want to do  is replace the old id with the new but we have to unpack all the columns  of the books table and repacked them in a new insert; something like

field1 = book.field1
field2 = book.field2
field3 = book.field3
  ...
INSERT INTO books VALUES (newid, field1, field2, field3,.....);

This is quite tedious and if we want to achieve this programmatically  means we have to specify the table structure of the books table is such  a way that if the schema changes, we have to remember to go back and  change this script as well.

But there is a better way:

The Solution

Instead of stating the fields explicitly, we can use the

INSERT INTO table SELECT *

form provided we can figure a way of replacing the library id. The following plpgsql script illustrates how to do this:

CREATE FUNCTION duplicatebooks (oldid INTEGER, newid INTEGER)
RETURNS void
AS $$
DECLARE
   book books;
BEGIN
   FOR book IN SELECT * FROM books WHERE id=oldid LOOP
     book.id = newid;
     INSERT INTO BOOKS SELECT book.*;
   END LOOP;
 END;
$$ LANGUAGE plpgsql;

In this script:

  • we pass the ids of the libraries, oldid and newid, as parameters to the function
  • declare a book RECORD variable to contain data on each book
  • iterate over the books we want to duplicate
  • change the id of the book record to match the new library
  • and finally add the new book record

In this way, we do not even need to know the other fields in the book table

Problems

In principle, we could have used a generic RECORD variable when  iterating over the books table but unfortunately trying to pass a  generic RECORD to the INSERT statement is explicitly disallowed and  results in the error:

ERROR: record type has not been registered

The solution is to just declare a type for the record at the outset. So instead of having this fragment (which fails):

DECLARE
  book RECORD;
BEGIN
  FOR book IN SELECT * FROM books WHERE id=oldid LOOP
    book.id = newid;
    INSERT INTO books SELECT book.*;

we do this instead

DECLARE
  book books;
BEGIN
  FOR book IN SELECT * FROM books WHERE id=oldid LOOP
    book.id = newid;
    INSERT INTO books SELECT book.*;