PostgreSQL: Perl procedures with PL/pgSQL.

By Mark Nielsen

Introduction
Downloading and installing Perl.
Downloading and installing PostgreSQL with Perl.
Example perl/sql commands.
Setting up the tables and pl/perl procedures for the Insert, Update, and Delete pl/pgsql procedures
Insert pl/pgsql procedure
Update pl/pgsql procedure
Delete pl/pgsql procedure
Considerations to explore.
Conclusion
References

Other Notes

Rod Pall corrected me, and figured out how to make it so you DON'T have to recompile perl. You can use you existing Perl for PostgreSQL. Here are his notes, and I verified that is worked.

-----------------------------------------------------------------------

I found something pretty neat, that you don't need to compile perl dynamically. I didn't want to recompile perl, because I already had 5.6.1 installed and I had set up so many options manually that the thought of redoing all that work was not very appealing.

You can compile with libperl.a, the code archive instead of the shared object code. When you run configure on postgresql (tested on 7.1.3), just give the argument
--with-libraries=$LIBPERLA
where LIBPERLA for me was /usr/lib/5.6.1/i686-linux-ld/CORE/libperl.a

Then in the PL/Perl makefile just erase the check for a dynamic perl, lines 12-30. This file is located at
$POSTGRESQL_UNPACKED/src/pl/plperl/Makefile.PL
(tested on 7.1.3)

This works because if you read the comments preceding that check in the makefile, it says this:
# Can't build a shared plperl unless libperl is shared too.
# (Actually, it would be enough if code in libperl.a is compiled
# to be position-independent, but that is hard to check for and
# seems pretty unlikely anyway.)

I don't remember if I compiled that code to be position-independent, but I do know that my pl/perl functions work, as I just tested a few out.
----------------------------------------------------------------------

Introduction

PostgreSQL has come a long way with version 7.1. I have been waiting for better handling of large objects. In earlier versions, there was the size limit of 32k for a field in a table. Otherwise, you had to use a cumbersome way of manipulating large objects.

I finally decided to get Perl installed into PostgreSQL because PostgreSQL has all the features I like :

A real language for stored procedures (PL/pgSQL).
Nice handling of large objects.
Embedded Perl commands.
Is similar to Oracle in many ways, thus making the transition from Oracle to PostgreSQL or vice versa reasonable.
Has many advanced features that I desire with a database server.
Has a free web book. I am big on free documentation.

The overall process was a pain because of slight adjustments here and there. . Here are the basic steps:

Install Perl 5.6.1. Use all the default options except for two changes.
Install PostgreSQL after you install Perl.
Install Perl into PostgreSQL, and make one fix.

Downloading and installing Perl.

Make sure you install Perl before you install PostgreSQL. I don't know if the latest versions of RedHat 7.1 or Debian have libperl as a shared module.

cd /usr/local/src
lynx --source http://www.tcu-inc.com/perl5.6.1.tgz > perl-5.6.1.tgz
tar -zxvf perl-5.6.1.tgz
cd perl-5.6.1
rm -f config.sh Policy.sh
sh Configure

Change the default prefix to "/usr" instead of "/usr/local". Also, when it asks the question "Build a shared libperl.so (y/n) [n] ", answer y. Press enter for any other question.

make
make install

Downloading and Installing PostgreSQL with Perl.

When I downloaded PostgreSQL, I also tried to install interfaces for tcl, c, python, and obdc. I haven't tried JAVA, but it is an option. Also, if you are going to use Perl with PostgreSQL, I recommend downloading and installing DBI and DBD:Pg from cpan.perl.com.

Tcl and Perl are options in the procedural languages. You can actually execute Perl and Tcl inside sql commands. Also, you get the standard PL/pgSQL procedural language (which is similar to pl/sql). Here are the steps I used to install PostgreSQL with Perl. Here is a text file with the same information.

In the home directory of the user postgres, make a file called ".profile" and put this in it.

#!/usr/bin

PATH=/usr/local/pg711/bin:$PATH
export PATH
export LD_LIBRARY_PATH=/usr/local/pg711/lib
export PGDATA=/usr/local/pg711/data
export PGLIB=/usr/local/pg711/lib
export POSTGRES_HOME=/usr/local/pg711

Then, execute this command,

chmod 755 .profile

Example perl/sql commands.

Execute the commands at, http://www.ca.postgresql.org/users-lounge/docs/7.1/programmer/plperl-use.html

Since I had you create the database "postgres", all you have to do is enter these two commands starting as the user "root" to get into the psql interface.

su -l postgres
psql

This assumes you also correctly setup .profile for the user postgres. If you didn't, then follow these commands:

su -l postgres
PATH=/usr/local/pg711/bin:$PATH
export PATH
export LD_LIBRARY_PATH=/usr/local/pg711/lib
export PGDATA=/usr/local/pg711/data
export PGLIB=/usr/local/pg711/lib
export POSTGRES_HOME=/usr/local/pg711
psql

The following function lets you search the data and return a copy of the name if the name contains the text you search for with a case insensitive option.

drop function search_name(employee,text,integer);
CREATE FUNCTION search_name(employee,text,integer) RETURNS text AS '
    my $emp = shift;
    my $Text = shift;
    my $Case = shift;

    if (($Case > 0) && ($emp->{''name''} =~ /\\Q$Text\\E/i)) 
      { return $emp->{''name''}; }
    elsif ($Case > 0) {return "";}
    elsif ($emp->{''name''} =~ /\\Q$Text\\E/) 
       {    return $emp->{''name''}; }
    else { return "";}
' LANGUAGE 'plperl';

insert into EMPLOYEE values ('John Doe',10000,1);
insert into EMPLOYEE values ('Jane Doe',10000,1);
insert into EMPLOYEE values ('Giny Majiny',10000,1);

select name,search_name(employee,'j',0) from employee;
select name,search_name(employee,'j',1) from employee;

select name from employee where search_name(employee,'j',1) = name;
select name from employee where search_name(employee,'j',0) = name;

Obviously, the function is a little ridiculous. It should just return 0 for false or 1 for true. But for visual reasons, I have it return a copy of name.

Setting up the tables and pl/perl procedures for the Insert, Update, and Delete pl/pgsql procedures

You can get a copy of the SQL commands for this section here: SQL_setup.txt. There are several things I want to accomplish:

Create insert, update, and delete stored procedures that will backup all changes to a history table or backup table. This will record everything that happens. Reasonable error checking is required. We could do more error checking, but the stored procedures I created would just get too big.
To use a Perl procedures to clean out input being put into the tables. Granted, we could use sql commands, but the perl commands are so much easier for me to read.

I do not believe it is possible to get the perl procedures to execute insert, update, delete, or select commands. The only thing that I have gotten Perl to do is accept values and to output a single value. You should never need Perl to execute sql anyways. You aren't using Perl to execute commands, but to modify data, act as a filter, or check for errors. Use pl/pgsql to handle all the sql commands. Just use Perl to manipulate data and not directly do anything to the database.

Below, I have three tables: jobs, jobs_backup, and contact. I will only create stored procedures for the table 'jobs'. The two perl procedures are only meant to verify that we have valid data to input, and to filter out non-printable characters, and get rid of whitespace. We use pl/pgsql to perform the actual insert, update, and delete commands.

Using this basic method of handling data, you can replicate it for any other table you have.

Some things I have to watch out for is the fact I want unique names for the jobs. I don't want two jobs to have the same name from one recruiter. This gets a little tricky, but it works fine.

Also, I could use a foriegn key restraint so that you cannot have a contact_id in 'jobs' without it existing in 'contact'. The only problem is, we may at some point accidentally delete contact_ids from contact and then things are messed up anyways. The best solution is to add a "active" column to the "jobs" and "contact" tables in which you turn off and on objects. In this way, you never delete unique ids ever.

Insert pl/pgsql procedure

You can get a copy of the SQL commands for this section here: SQL_insert.txt.

Update pl/pgsql procedure

You can get a copy of the SQL commands for this section here: SQL_update.txt. The update procedure has to check to see if there is a job that has the same name we are trying to change the current job to. If there is, we don't want to make any changes (except if the job_id is the same). Did you remember that there is a unique constraint on the name for the same recruiter?

Delete pl/pgsql procedure

You can get a copy of the SQL commands for this section here: SQL_delete.txt.

Considerations to explore.

If you install perl 5.6.1, check to see what happens with mod_perl or any of the other perl modules you custom installed previously. Installing perl 5.6.1 may break modules you were previously using with a different version of Perl. I don't know, but be careful. You may have to recompile modules.

Here is a summary of the things you should consider:

Create a foriegn key constraint so that the contact_id of the jobs has to exist in the contact table.
Never delete a row in the tables 'contact' and 'job'. Instead, add a column to just inactivate them. Set the name of the column to be 'active' where 0 means inactive and 1 means active.
You can combine insert/update procedures into one procedure. If your data is always 100% accurate, then always issue the update procedure, and if the update procedure doesn't find the job, it just inserts it for you. Under some conditions, this can be useful.
Anytime you can ad a check for an error, do it. Although my error checks are reasonable, more can be done.
Forget using Perl procedures to execute direct sql commands. Just use them to manipulate data.
Use pl/pgsql to combine perl procedures with sql commands.
You should setup the procedures to rollback in case an insert, update, or delete couldn't work for some unknown reason.
I don't know how much memory is used up with perl procedures and I don't know how memory gets freed up when a perl procedure is done executing. Also, I don't know the overhead of executing perl procedures. For my purposes, pl/pgsql procedures are always going to be faster than manually executing sql commands using Perl scripts on the webserver side. Since I am headed in the right direction anyways, I am willing to live with any overhead there is with the perl procedures. Besides, I can probably take very complex sql commands and shrink them down into a few lines of Perl code. If I balance out the proper use of pl/pgsql, standard sql, and pl/perl, I see significant power gain and little drawbacks.

Freeing up permissions in Perl

What I am about do to is very bad. It relaxes some of the security issues in Perl so that you can do more stuff.

First, of all, add this method right below the "permit" method in Safe.pm. My Safe.pm was at /usr/local/src/perl-5.6.1/lib/Safe.pm. Changing a module that you did not create means that if you ever update this module, the changes will get wiped. Once more, you MIGHT MESS UP THE PROGRAMMING FROM ONE OF YOUR FRIENDS WHO IS PROGRAMMING ON THAT COMPUTER AS WELL. Again, I am doing some naughty things you should not do.

sub permit_all {
    my $obj = shift;
    $obj->{Mask} = invert_opset full_opset;
}

Second, shut down your database server.

Third, recompile plperl with some changes.

Make these changes in the file plperl.c. From this

To This (which you can get from this file New_plperl.txt)

Now recompile plperl and install it.

cd /usr/local/src/postgresql-7.1.1/src/pl/plperl
rm -f *.o
make 
make install

Fourth, restart the postgresql database server.

See if you can escape to a shell,

drop function ls_bad ();
CREATE FUNCTION  ls_bad () RETURNS text AS '
my @Temp = `ls /tmp`;
my $List = "@Temp";
$List =~ s/\n/ /g;
  return $List;
' LANGUAGE 'plperl';
select ls_bad();

If you get the contents of your "/tmp" directory, then you can escape to a shell just fine. This is very dangerous.

For a whole day, I was trying to figure out how to get DynaLoader to work in pl/perl. Basically, I read documentation about how to embed Perl in C, and it isn't that hard to do. There is even a manpage about it. I kept on running into problems. Lastly, I tried to not use the Safe.pm module altogether, but I didn't get very far. I was so close to compiling Dynaloader into plperl, but I gave up. After blowing off a day, I want someone else to give it a try.

If you can get DynaLoader to work properly with plperl, or more accurately, you find a way to make it so I can load any module I want with plperl, then please let me know. I got to the point where I could load pure pm modules, but not modules which had c components. I would like to be able to load any module whatsoever. I believe we have to stop using Safe.pm to make it easier. Please send email to articles@gnujobs.com. I would be very interested if you succeed!

Please don't do this. I only wanted to show you how you can get around security issues if you really wanted to.

Conclusion

Combining Perl with PL/PGSQL is a REALLY REALLY COOL thing. Why?

I like use Perl to manipulate the data because SQL is such a pain the in butt sometimes (in manipulating data).
The combination of Perl and PL/PGSQL can make it so most of the work is done on the database end, which means, you can do less programming on the client end. For example, let us say you have a webserver that connects to a database server. If the database server is handling a lot of the perl work, your perl scripts on the webserver won't be as big. This is true for stored procedures in general anyways. Is isn't a lot nicer to execute one stored procedure than to have Perl scripts on the webserver side executing all those steps (that the procedure does for you)?
If you can do all the fancy perl programming on the database end, then you will have less work with any language that you choose to connect to your database server. Thus, your stored procedures become objects that your web programmers just have to understand how to use, but not understand how they were made. This is very nice.
I am going to slowly use more perl procedures (where appropriate) and test the stability of pl/perl.
I would like to be able to load any module into plperl just for giggles. If you find out how to do with, please send me email at articles@gnujobs.com.

PostgreSQL is by far the coolest database server I have ever worked with. MySQL comes a close second. I never really enjoyed working on any commercial database server. I see so much more potential with PostgreSQL, that I actually see commercial database servers following some of the things PostgreSQL will do. I am very eager to use the python interface being developed at http://www.ca.postgresql.org/cgi/cvsweb.cgi/pgsql/src/pl/ I believe the ability to have procedures written in different programming languages will become very valuable in the future. I am also eager to see if we can get procedures to return more than just one value. It is very annoying that we can only return one value. I tried to define a function with more than one return value, and it didn't work. I tried to get a procedure to return a RECORD, but I didn't get very far.

References

Procedural Languages
PostgreSQL: Introduction and Concepts
A recent article, http://www.newbienetwork.net/sections.php?op=viewarticle&artid=25
If this article changes, it will be available here http://www.gnujobs.com/Articles/20/Perl_PostgreSQL.html

Mark works as an independent consultant donating time to causes like GNUJobs.com, writing articles, writing free software, and working as a volunteer at eastmont.net.