Introduction
In Adjust we are creating new data types for PostgreSQL, in order to make handling of data easier for us, but also sometimes to save some space. Many of them are written using the C language. Here are some of the more important extensions for us:
- istore: it is an integer based
hstore. In
istore
both keys and values are represented and stored as integers. - pg-base36: a base36 extension, which implements a base36 binary-to-text encoding algorithm.
- ajbool: a boolean type that allows
unknown
state without being NULL.
In addition to that, among the other extensions are a lot which add new enum data types:
- pg-os_name: an extension which adds several enum values representing different operating systems. You can see a list of such values in the source code.
- pg-country: an extension which adds enum values representing countries of the world.
- pg-device_type: an extension which adds enum values representing different device types.
We have quite a few other ENUM extensions that add enum data types but they are private and cannot be shared to the public as of now. To generate new PostgreSQL enum data types,we use pg_type_template which is a Python package that generates all necessary PostgreSQL extension files for us.
Usage
To use pg_type_template
you need to install it as a Python package. For that,
first you have to initialize a virtual environment:
python3 -m venv venv
source venv/bin/activate
pip install git+ssh://git@github.com/adjust/pg_type_template.git
Example
Once the package is installed, you can prepare a Python file which contains new
data type definitions. Here is an example for such a file (type_templ.py
):
import pg_type_template
model = {
"ext_name": "example_type",
"ext_version": "0.0.1",
"types": [
{
"type_name": "example_type",
"type_values": [
{ "name": "enum 1", "value": 60 },
{ "name": "enum 2", "value": 120 },
{ "name": "enum 3", "value": 180 },
{ "name": "unknown", "value": 255 },
]
}
]
}
templ = pg_type_template.TypeTemplate(model)
templ.render_to_dir(".")
ext_name
: name of the new extension which will be used for a binary file name and for a PostgreSQL extension object.ext_version
: version of the new extension when it will be installed in a database.type_name
: name of the new data type which will be created after the extension is installed. Here in the example the script defines only one data type, but it can define more than one data type.type_values
: list of enum names and their values. Values cannot be less than0
and greater than255
.templ.render_to_dir(".")
: generates a new extension with all necessary files in the directory which is passed as an argument.
Once the script is prepared the PostgreSQL extension can be generated by running the following command:
python3 type_templ.py
The result is a regular PostgreSQL extension, which can be compiled and installed.
Summary
pg_type_template
allows you to generate all necessary PostgreSQL extension
files to add a new enum data type, including source code in C, regression tests,
extension installation SQL script, .control file and Makefile. It also includes
a Github Actions workflow file.
Importantly the extension installation SQL script contains all main operators
to work with new enum types as well as btree
and hash
operator classes.
It however has some limitations. For example enum values can be in range from
0
to 255
only, this is only because each enum value is uint8
value to fit
into 1 byte. Additionally, the generated extension doesn’t have any complex
logic, but the code can be added afterwards.