Showing posts with label development. Show all posts
Showing posts with label development. Show all posts

23 Apr 2024

Installing pg_tle on Ubuntu: A Quick Guide

Compile & Install pg_tle on Postgres

PostgreSQL is a powerful database, but sometimes you want to extend its functionality with custom features, and that's where extensions like pg_tle (Trusted Language Extensions) come into play.

If you're new to pg_tle, here's a quick round-up of why it makes sense for you - See Unlock PostgreSQL Superpowers with pg_tle.

Given the power of pg_tle, you may want to install it locally (on your laptop or an EC2 instance) before deploying to environments with restricted access (such as Production, or PostgreSQL services offered by major cloud providers). This is not only helpful to thoroughly test your code, but also to save on cost given that all development can then happen on-premise.

In this blog post, we'll go through the process of compiling and installing pg_tle for postgres on your Ubuntu system.

Prerequisites

An operating system running Ubuntu (this guide assumes Ubuntu 20.04 or similar).

  • A PostgreSQL database server, installed and running.
  • Basic familiarity with the command line and postgresql.conf.
  • Some development tools (we'll install these as we go).

Steps

Install Build Tools and Dependencies:

Start by updating your Ubuntu package list & install the necessary tools and libraries (This is required, since we would be compiling the pg_tle extension by source).

$ sudo apt update
$ sudo apt install build-essential make git postgresql-server-dev-all 

The postgresql-server-dev-all package version may need to be adjusted to match your specific PostgreSQL version. If this doesn't work for you, you can instead read more about setting up your Ubuntu operating system (albeit in a dated post) here - See Setup Linux for PostgreSQL development [3].

Download pg_tle Source Code:

Get the pg_tle source code from the GitHub project repository:

$ git clone https://github.com/aws/pg_tle.git

Compile pg_tle:

Compile the source code to create the extension files:

$ cd pg_tle
$ make

Although rare, make may fail if it can't find pg_config. As in the example below, a quick hack could be to help by pointing make to the pg_config binary location:

$ make                                     <<===== Fails
Makefile:24: /usr/lib/postgresql/15/lib/pgxs/src/makefiles/pgxs.mk: No such file or directory
make: *** No rule to make target '/usr/lib/postgresql/15/lib/pgxs/src/makefiles/pgxs.mk'.  Stop.

$ PG_CONFIG="`type -P -a pg_config`" make  <<===== Works successfully
gcc -Wall -Wmissing-prototypes ...
.
.
. (compiling starts successfully)

Install pg_tle:

Install the compiled extension into your PostgreSQL database. This command would install the extension related files to the postgres binaries folder, pointed to by PG_CONFIG:

$ sudo make install

Enable pg_tle in Your Database:

Connect to your PostgreSQL database using your preferred tool (e.g., psql) and run the following SQL command:

test_pgtle=# CREATE EXTENSION pg_tle;
CREATE EXTENSION

Verification:

To confirm pg_tle is installed correctly, run this SQL query:

test_pgtle=# SELECT * FROM pg_available_extensions WHERE name = 'pg_tle';
-[ RECORD 1 ]-----+-------------------------------------------
name              | pg_tle
default_version   | 1.4.0
installed_version | 1.4.0
comment           | Trusted Language Extensions for PostgreSQL

You should see a result similar to the above, where installed_version confirms the pg_tle version that's installed successfully.

Conclusion

You've now successfully compiled and installed the pg_tle extension on your Ubuntu system! This opens up the possibility to create and deploy custom extensions to enhance your PostgreSQL database.

pg_tle is a powerful tool that allows you to develop more advanced extensions. You can find more information and examples in the official pg_tle documentation at https://github.com/aws/pg_tle.

If you're intrigued, keep an eye out for a follow-up post where I'll show a simple example of how to use pg_tle extension for a real-world need!

References

  1. Unlock PostgreSQL Superpowers with pg_tle - https://www.thatguyfromdelhi.com/2024/04/unlock-postgresql-superpowers-with-pgtle.html
  2. Setup Linux for PostgreSQL development - https://www.thatguyfromdelhi.com/2011/12/setup-ubuntu-for-postgresql-development.html

20 Apr 2024

Unlock PostgreSQL Superpowers with pg_tle

pg_tle - A Must-Know for Developers

PostgreSQL is a fantastic database, packed with features. But sometimes, you need to add a little something extra – a custom function, a specialized data type, or maybe a procedure written in your favorite programming language. That's traditionally where PostgreSQL extensions come in. But for developers working with managed databases (from major cloud providers), while installing supported extensions is trivial installing custom or unsupported extensions can be tricky or even impossible.

Enter pg_tle!

What exactly is pg_tle?

pg_tle stands for Trusted Language Extensions. Here's the breakdown:

  • Extension: A piece of software that adds new functionality to PostgreSQL.
  • Trusted Language: A programming language (like JavaScript, Perl, or PL/pgSQL) that the database 'trusts' due to security restrictions built into the language itself.
  • The Magic: pg_tle provides a framework to build, package, and install extensions in these trusted languages, even in environments where you can't normally touch the underlying server.

Why is pg_tle Important for Application Developers?

  1. Unlock New Possibilities: pg_tle lets you create custom database functions tailored to your application's unique logic. Say you need complex data validation or even specialized calculations – pg_tle can make it happen.

  2. Bypass Restrictions: On managed database services, you often can't install traditional extensions. pg_tle works within these constraints allowing you to add functionality even in these environments - See Use Cases below for more.

  3. Enhanced Security: Because pg_tle uses trusted languages, your extensions have limits on what they can do. It leads to a more secure database overall.

  4. Open Sourcepg_tle is open source [3] and available on Github (Apache 2.0 License).

Use Cases - Just A Sample Of Ideas

  • Use Unsupported Extensions: This itself is a reason big enough to try out pg_tle! [4]
  • Custom Password Strength Checks: Go beyond basic password rules with an extension.
  • Login Triggers: Build custom rules that get triggered for every time a user logs into the database - not just for (the upcoming) v17 but also for older PostgreSQL versions!
  • Custom Data-Types: Build custom data-types that you could use to store data, functions and views within your database.
  • Data Transformations: Perform complex data manipulations directly within the database.

The Trade-offs

  • Complexity: Creating pg_tle extensions can be more involved than basic SQL scripting.
  • Limitations: Trusted languages still have constraints compared to the full capability of extensions developed in lower-level languages (like C).

Should You Learn More?

If any of the following resonate with you, then deeper pg_tle knowledge could be a big win:

  • You crave more flexibility in how you work with data, and although have found an extension it's not yet natively supported in your PostgreSQL installation.
  • Your application has a few "extra tricky" calculations or data processing needs, that are much easily possible with Perl, Javascript, PL/PGSQL.
  • You work with managed databases and miss those power-user extensions.

Conclusion

pg_tle is a powerful tool that adds flexibility and extensibility for PostgreSQL developers. While a bit more advanced, understanding pg_tle unlocks a new level of database customization for your applications.

If you're intrigued, keep an eye out for a follow-up post where I'll show a simple example of building a pg_tle extension!

References

  1. Trusted Language Extensions for PostgreSQL on Amazon Aurora and Amazon RDS - https://aws.amazon.com/blogs/aws/new-trusted-language-extensions-for-postgresql-on-amazon-aurora-and-amazon-rds/
  2. Creating custom data-types using Trusted Language Extensions - https://aws.amazon.com/blogs/database/create-custom-postgresql-data-types-using-trusted-language-extensions/
  3. pg_tle is open source! - https://github.com/aws/pg_tle
  4. pg_tle makes using custom extensions easier - https://aws.amazon.com/blogs/opensource/supabase-makes-extensions-easier-for-developers-with-trusted-language-extensions-for-postgresql/
  5. Installing pg_tle on Ubuntu - https://www.thatguyfromdelhi.com/2024/04/installing-pgtle-on-ubuntu-quick-guide.html

29 Dec 2020

Which SQL causes a Table Rewrite in Postgres?

EDIT: Updated to v17 (devel) - (Jan 2024).


While developing SQL based applications, it is commonplace to stumble on these 2 questions:

  1. What DDLs would block concurrent workload?
  2. Whether a DDL is going to rewrite the table (and in some cases may need ~ 2x disk space)?

Although completely answering Question 1 is beyond the scope of this post, one of the important pieces that helps answering both of these questions is whether a DDL is going to cause a relfilenode change..

For a brief background, each regular table in Postgres stores data in one or more files, each of which is referenced in the postgres catalog with a relfilenode. A simple way to check whether the current implementation is going to create / refer to another copy (file) is whether the relfilenode changes. (TRUNCATE is a standout here, which by design is going to purge the table data, so although the relfilenode would change here, in total it obviously wouldn't consume anywhere close to 2x disk-space)

The table below shows which DDLs would cause a table rewrite. As has been discussed here, we need some more info to completely answer Question 1, however meanwhile this table helps in making some concurrency / disk-usage related decisions for all Postgres versions supported today.



14 Oct 2017

First alpha release of PsqlForks - Menon

Primer: PsqlForks aims to support all DB Engines that (even partially) speak Postgres (psqlforks = psql for Postgres forks).

Given that PsqlForks has been in development for a few weeks, it's time to stabilize a bit and towards that, we finally have Menon, PsqlForks first Alpha Release. Being an alpha, by definition it isn't ready for production, but it feels stable enough ... feel free to test it out!

Importantly, this fork is synced with postgres/master regularly, and should ideally sport all recent psql developments. Further, I am not a C expert and am just barely comprehending Postgres, so let me know of any 18-wheelers that I didn't see.

The release title - 'Menon', is a common sub-Caste in South-Indian state of Kerala. Selecting this nomenclature emanates from the idea of popularizing (heh!) common names and places from Kerala... and that it doesn't hurt to have an identifiable name (and while at it, add character) to a Release :) 

This release includes: 

  • Decent support for Redshift:
    • SQL tab completion for Redshift related variations
    • \d etc. now support Redshift specifics - ENCODINGs / SORTKEYs / DISTKEY / COMPRESSION etc.
    • Support Temporary Credentials using IAM Authentication (via AWS CLI)
    • View detailed progress here.
  • Basic support / Recognition semantics for:
    • CockroachDB - view progress here
    • PipelineDB
    • PgBouncer
    • RDS PostgreSQL
You could read more here:

For the interested:

13 Oct 2017

PsqlForks supports AWS IAM authentication for Redshift

With this commit, PsqlForks ( http://psqlforks.com ) can now fetch credentials from AWS IAM. Read more about Redshift's support for generating database credentials using IAM authentication feature, here.

Since the entire AWS CLI isn't baked into PsqlForks (yet!), you'd need a working copy of AWS CLI installed / working on the host (from where psql is called).

This took a while, since I missed the basic assumption that Redshift enforces SSL and psql doesn't attempt SSLMODE by default in the first try. The fact that CYGWIN wasn't super-smooth with AWS CLI in my test installation, didn't help either.

But as they say, all's well that ends well. There are few obvious additions that are possible (such as expiration validation / re-use unexpired credentials on re-connect etc.) but this should get merged in the forks mainline soon.

I guess it's time to begin thinking of releases, instead of making the mainline jittery with feature additions such as this one.

Yenjoy!


$ psql "sslmode=require host=redshift_cluster port=5439 dbname=redshift2" -U testing1
Password for user testing1:
psql: fe_sendauth: no password supplied

$ psql -I "sslmode=require host=redshift_cluster port=5439 dbname=redshift2" -U testing1

CLI: aws redshift get-cluster-credentials --auto-create --db-user testing1 --cluster-identifier redshift2 # Informational / testing output

psql (client-version:11devel, server-version:8.0.2, engine:redshift)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: on)
Type "help" for help.

redshift2=> select current_user;
 current_user
--------------
 testing1
(1 row)

redshift2=> \du
                     List of roles
 Role name |          Attributes           | Member of
-----------+-------------------------------+-----------
 redshift2 | Superuser, Create DB         +|
           | Password valid until infinity |
 testing1  |                               |

redshift2=> \q

$ ./psql --help | grep -i iam
  -I, --aws-iam-redshift   use temporary database credentials from AWS IAM Service

12 Oct 2017

PsqlForks now recognizes RDS PostgreSQL separately

With this commit, PsqlForks ( http://psqlforks.com ) now recognizes RDS PostgreSQL separately.

This isn't utilized much yet, but the infrastructure is going to be helpful in skipping / avoiding some commands that are defunct / not possible in the RDS PostgreSQL offering.

29 Sept 2017

PsqlForks now recognizes PgBouncer

With this commit, PsqlForks knows when it's talking to PgBouncer (and not Postgres).

Down the line, this should pave way for PsqlForks to more cleanly convey why (most of) the given psql shortcut(s) don't work (and what else does).

As always, the psql/README always has the most updated status of any engine support.

$ psql -h localhost -E -p6543 -U postgres pgbouncer
psql (client-version:11devel, server-version:1.7.1/bouncer, engine:pgbouncer)
Type "help" for help.

pgbouncer=# show version;
NOTICE:  pgbouncer version 1.7.1
SHOW
pgbouncer=#

25 Sept 2017

PsqlForks now supports CockroachDB

PsqlForks now supports CockroachDB as much as is currently possible. You can check it's current SQL status here.

$ /opt/postgres/master/bin/psql -h localhost -E -p 26257 -U root
psql (client-version:11devel, server-version:9.5.0, engine:cockroachdb)
Type "help" for help.

root=> select version();
                                version()
--------------------------------------------------------------------------
 CockroachDB CCL v1.0.6 (linux amd64, built 2017/09/14 15:15:48, go1.8.3)
(1 row)
bank=> \l
                                      List of databases
        Name        | Owner |     Encoding      |  Collate   |   Ctype    | Access privileges
--------------------+-------+-------------------+------------+------------+-------------------
 bank               |       | Not Supported Yet | en_US.utf8 | en_US.utf8 | Not Supported Yet
 crdb_internal      |       | Not Supported Yet | en_US.utf8 | en_US.utf8 | Not Supported Yet
 information_schema |       | Not Supported Yet | en_US.utf8 | en_US.utf8 | Not Supported Yet
 pg_catalog         |       | Not Supported Yet | en_US.utf8 | en_US.utf8 | Not Supported Yet
 system             |       | Not Supported Yet | en_US.utf8 | en_US.utf8 | Not Supported Yet
(5 rows)
bank=> \dv
      List of relations
 Schema | Name | Type | Owner
--------+------+------+-------
 bank   | a    | view |
(1 row)

bank=> \di
                       List of relations
 Schema |          Name           | Type  | Owner |   Table
--------+-------------------------+-------+-------+------------
 bank   | primary                 | index |       | accounts
 system | jobs_status_created_idx | index |       | jobs
 system | primary                 | index |       | descriptor
 system | primary                 | index |       | eventlog
 system | primary                 | index |       | jobs
 system | primary                 | index |       | lease
 system | primary                 | index |       | namespace
 system | primary                 | index |       | rangelog
 system | primary                 | index |       | settings
 system | primary                 | index |       | ui
 system | primary                 | index |       | users
 system | primary                 | index |       | zones
(12 rows)

15 Sept 2017

PsqlForks now supports PipelineDB

After working on this PSQL variant that intends to support all Postgres forks, I finally narrowed down to naming it.

Since this was essentially Psql (for) Forks, quite intuitively, I chose to name it PsqlForks.

Considering that until recently this fork just supported Amazon Redshift, this naming didn't make much sense if it wasn't supporting at least 2 forks :) !

Thus, PsqlForks now supports PipelineDB!


$  /opt/postgres/master/bin/psql -U pipeline -p 5434 -h localhost pipeline
psql (client-version:11devel, server-version:9.5.3, engine:pipelinedb)
Type "help" for help.

pipeline=# \q

2 Sept 2017

psql \d now supports Interleaved / Compound SORTKEYs (in Redshift)

In continuation of support for Redshift series, now Describe Table (for e.g. \d tbl) shows SORTKEY details. This resolves Issue #6 and shows both COMPOUND / INTERLEAVED variations along with all the column names.

This change was complicated because Redshift doesn't natively support LISTAGG() function on System / Catalog tables, which meant that I had to resort to a pretty verbose workaround. This in-turn meant that this patch shows only the first ten COMPOUND SORTKEYs of a table. Seriously speaking, it would really take an extreme corner-case, for someone to genuinely require a SORTKEY with 10+ columns.

This is not a limitation for INTERLEAVED SORTKEY since it only supports a maximum of 8 Columns.


db=# CREATE TABLE tbl_pk(custkey SMALLINT PRIMARY KEY);
CREATE TABLE
db=# \d tbl_pk
                                           Table "public.tbl_pk"
 Column  |   Type   | Encoding | DistKey | SortKey | Preload | Encryption | Collation | Nullable | Default
---------+----------+----------+---------+---------+---------+------------+-----------+----------+---------
 custkey | smallint | lzo      | f       | 0       | f       | none       |           | not null |
Indexes:
 PRIMARY KEY, btree (custkey)

db=# CREATE TABLE tbl_compound(
db(#   custkey   SMALLINT                ENCODE delta NOT NULL,
db(#   custname  INTEGER DEFAULT 10      ENCODE raw NULL,
db(#   gender    BOOLEAN                 ENCODE RAW,
db(#   address   CHAR(5)                 ENCODE LZO,
db(#   city      BIGINT identity(0, 1)   ENCODE DELTA,
db(#   state     DOUBLE PRECISION        ENCODE Runlength,
db(#   zipcode   REAL,
db(#   tempdel1  DECIMAL                 ENCODE Mostly16,
db(#   tempdel2  BIGINT                  ENCODE Mostly32,
db(#   tempdel3  DATE                    ENCODE DELTA32k,
db(#   tempdel4  TIMESTAMP               ENCODE Runlength,
db(#   tempdel5  TIMESTAMPTZ             ENCODE DELTA,
db(#   tempdel6  VARCHAR(MAX)            ENCODE text32k,
db(#   start_date VARCHAR(10)            ENCODE TEXT255
db(# )
db-# DISTSTYLE KEY
db-# DISTKEY (custname)
db-# COMPOUND SORTKEY (custkey, custname, gender, address, city, state, zipcode, tempdel1, tempdel2, tempdel3, tempdel4, tempdel5, start_date);
CREATE TABLE
db=#
db=# \d tbl_compound
                                                                 Table "public.tbl_compound"
   Column   |            Type             | Encoding  | DistKey | SortKey | Preload | Encryption | Collation | Nullable |              Default
------------+-----------------------------+-----------+---------+---------+---------+------------+-----------+----------+------------------------------------
 custkey    | smallint                    | delta     | f       | 1       | f       | none       |           | not null |
 custname   | integer                     | none      | t       | 2       | f       | none       |           |          | 10
 gender     | boolean                     | none      | f       | 3       | f       | none       |           |          |
 address    | character(5)                | lzo       | f       | 4       | f       | none       |           |          |
 city       | bigint                      | delta     | f       | 5       | f       | none       |           |          | "identity"(494055, 4, '0,1'::text)
 state      | double precision            | runlength | f       | 6       | f       | none       |           |          |
 zipcode    | real                        | none      | f       | 7       | f       | none       |           |          |
 tempdel1   | numeric(18,0)               | mostly16  | f       | 8       | f       | none       |           |          |
 tempdel2   | bigint                      | mostly32  | f       | 9       | f       | none       |           |          |
 tempdel3   | date                        | delta32k  | f       | 10      | f       | none       |           |          |
 tempdel4   | timestamp without time zone | runlength | f       | 11      | f       | none       |           |          |
 tempdel5   | timestamp with time zone    | delta     | f       | 12      | f       | none       |           |          |
 tempdel6   | character varying(65535)    | text32k   | f       | 0       | f       | none       |           |          |
 start_date | character varying(10)       | text255   | f       | 13      | f       | none       |           |          |
Indexes:
 COMPOUND SORTKEY (address,tempdel2,start_date,custkey,zipcode,tempdel4,city,state,tempdel3,custname)

db=# CREATE TABLE tbl_interleaved(custkey SMALLINT) INTERLEAVED SORTKEY (custkey);
CREATE TABLE
db=# \d tbl_interleaved
                                      Table "public.tbl_interleaved"
 Column  |   Type   | Encoding | DistKey | SortKey | Preload | Encryption | Collation | Nullable | Default
---------+----------+----------+---------+---------+---------+------------+-----------+----------+---------
 custkey | smallint | none     | f       | 1       | f       | none       |           |          |
Indexes:
 INTERLEAVED SORTKEY (custkey)

As a side-note, there is a consideration as to whether this should be on a separate section of its own (and not under Indexes, which it clearly isn't). May be another day. Happy Redshifting :) !

Update (15th Sep 2017):
This project has now been named PsqlForks!

Display IMDb Ratings on Einthusan

Technical Features ...