YugabyteDB is a distributed database with built-in resilience, seamless scalability, and flexible geo-distribution. It provides PostgreSQL API compatibility by forking and reusing the upper half of PostgreSQL for connection management, query parsing, planning, and execution. It also adds many customizations, such as a built-in Connection Manager and a Cost Based Optimizer. The lower half, which includes transaction management, WAL, and the storage engine, harnesses modern solutions including Hybrid Logical Clocks, an optimized fork of RocksDB, and Raft.
One of PostgreSQL’s many great features is native support for extensions. Extensions range from a simple collection of SQL functions, like pgcrypto, to a complete index implementation, like pgvector. There is a huge ecosystem of extensions available, and YugabyteDB (as a PostgreSQL fork) supports many of these.
In this blog, we take an in-depth look at the pg_cron extension, a cron-based job scheduler that serves as a building block for several other important PostgreSQL extensions, and how it works in YugabyteDB.
pg_cron in PostgreSQL
pg_cron is a simple cron-based job scheduler for PostgreSQL that runs inside the database as an extension. It allows you to schedule tasks using SQL commands, utilizing the familiar Unix cron syntax.
This integration enables the efficient management and automation of tasks within the database environment, seamlessly incorporating scheduling into existing SQL workflows. It is tied to PostgreSQL RBAC, enabling you to configure fine-grained access controls and audit the jobs.
pg_cron provides a simple set of primitives, allowing it to serve as a foundational scheduling utility on top of which you can build more sophisticated schedulers. For example, the partition management extension for PostgreSQL, pg_partman, depends on pg_cron. Check out some additional impressive examples from its creator, Marco Slot.
Here is an example of a simple job that periodically inserts data into a table. pg_cron has been added to shared_preload_libraries in postgresql.conf, as shown in the snippet below.
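A minimal configuration snippet might look like this (the cron.database_name setting is optional and shown only for illustration; it defaults to postgres):
# postgresql.conf (illustrative snippet)
shared_preload_libraries = 'pg_cron'   # load the pg_cron background worker at server start
cron.database_name = 'postgres'        # database that stores the pg_cron metadata
After restarting the server, the extension can be created and a job scheduled: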
postgres=# CREATE EXTENSION pg_cron;
CREATE EXTENSION
postgres=# CREATE TABLE tbl1(id SERIAL PRIMARY KEY, insert_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP, data TEXT);
postgres=# SELECT cron.schedule('my job', '* * * * *', 'INSERT INTO tbl1(data) VALUES (''cron was here'')');
schedule
----------
1
(1 row)
postgres=# SELECT pg_sleep(180);
pg_sleep
----------
(1 row)
postgres=# SELECT * FROM tbl1 ORDER BY id;
id | insert_time | data
----+----------------------------+---------------
1 | 2025-02-06 15:28:00.010105 | cron was here
2 | 2025-02-06 15:29:00.008067 | cron was here
3 | 2025-02-06 15:30:00.009117 | cron was here
(3 rows)
pg_cron Tables and Functions
The pg_cron extension is installed in a single database, which can be controlled using the cron.database_name parameter. This database stores the internal metadata for pg_cron.
Jobs can be scheduled to run on any database using the cron.schedule_in_database function.
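For example, assuming a separate database named analytics exists, a job could be scheduled to run inside it like this (a sketch; the job name, schedule, command, and target database are all illustrative):
SELECT cron.schedule_in_database('analytics-rollup', '0 3 * * *', 'REFRESH MATERIALIZED VIEW daily_rollup', 'analytics');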
The cron metadata is stored in two tables.
The cron.job table stores the metadata for each scheduled job.
Table "cron.job"
Column | Type | Collation | Nullable | Default
----------+---------+-----------+----------+-------------------------------------
jobid | bigint | | not null | nextval('cron.jobid_seq'::regclass)
schedule | text | | not null |
command | text | | not null |
nodename | text | | not null | 'localhost'::text
nodeport | integer | | not null | inet_server_port()
database | text | | not null | current_database()
username | text | | not null | CURRENT_USER
active | boolean | | not null | true
jobname | text | | |
Indexes:
"job_pkey" PRIMARY KEY, lsm (jobid HASH)
"jobname_username_uniq" UNIQUE CONSTRAINT, lsm (jobname HASH, username ASC)
Policies:
POLICY "cron_job_policy"
USING ((username = (CURRENT_USER)::text))
Triggers:
cron_job_cache_invalidate AFTER INSERT OR DELETE OR UPDATE OR TRUNCATE ON cron.job FOR EACH STATEMENT EXECUTE PROCEDURE cron.job_cache_invalidate()
Take a look at our example job:
postgres=# SELECT * FROM cron.job;
jobid | schedule | command | nodename | nodeport | database | username | active | jobname
-------+-----------+-------------------------------------------------+-----------+----------+----------+------------+--------+---------
1 | * * * * * | INSERT INTO tbl1(data) VALUES ('cron was here') | localhost | 9712 | postgres | documentdb | t | my job
(1 row)
The cron.job_run_details table records an entry for each job execution, referred to as a job run.
Table "cron.job_run_details"
Column | Type | Collation | Nullable | Default
----------------+--------------------------+-----------+----------+-------------------------------------
jobid | bigint | | |
runid | bigint | | not null | nextval('cron.runid_seq'::regclass)
job_pid | integer | | |
database | text | | |
username | text | | |
command | text | | |
status | text | | |
return_message | text | | |
start_time | timestamp with time zone | | |
end_time | timestamp with time zone | | |
Indexes:
"job_run_details_pkey" PRIMARY KEY, lsm (runid HASH)
Policies:
POLICY "cron_job_run_details_policy"
USING ((username = (CURRENT_USER)::text))
Our example job has one entry for each row it inserted:
postgres=# SELECT * FROM cron.job_run_details ORDER BY runid;
jobid | runid | job_pid | database | username | command | status | return_message | start_time | end_time
-------+-------+---------+----------+------------+-------------------------------------------------+-----------+----------------+-------------------------------+-------------------------------
1 | 1 | 2980 | postgres | documentdb | INSERT INTO tbl1(data) VALUES ('cron was here') | succeeded | INSERT 0 1 | 2025-02-06 15:28:00.000296+00 | 2025-02-06 15:28:00.012672+00
1 | 2 | 2985 | postgres | documentdb | INSERT INTO tbl1(data) VALUES ('cron was here') | succeeded | INSERT 0 1 | 2025-02-06 15:29:00.000113+00 | 2025-02-06 15:29:00.011825+00
1 | 3 | 2988 | postgres | documentdb | INSERT INTO tbl1(data) VALUES ('cron was here') | succeeded | INSERT 0 1 | 2025-02-06 15:30:00.000351+00 | 2025-02-06 15:30:00.011541+00
(3 rows)
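Because every run is recorded here, the table also works as a simple monitoring source. A query along these lines (an illustrative sketch) surfaces the most recent failed runs:
SELECT jobid, runid, status, return_message, start_time
FROM cron.job_run_details
WHERE status = 'failed'
ORDER BY start_time DESC
LIMIT 10;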
The jobs can be managed using the following functions:
- cron.schedule
- cron.schedule_in_database
- cron.unschedule
- cron.alter_job
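Continuing with the example job above, here is a sketch of how these functions might be used (the new schedule is purely illustrative):
-- change "my job" (jobid 1) to run every 5 minutes instead of every minute
SELECT cron.alter_job(job_id := 1, schedule := '*/5 * * * *');
-- remove the job entirely, looking it up by name
SELECT cron.unschedule('my job');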
Background Workers
PostgreSQL employs a single-threaded, multi-process architecture comprising one main postmaster process and multiple postgres backend and background worker processes. Each user connection is handled by a dedicated backend process.
Background processes include the stats collector, WAL writer, and background writer. PostgreSQL also allows extensions to run background workers.
pg_cron_launcher is the background worker process responsible for managing the cron jobs. It triggers job runs according to the schedule, tracks the progress of runs, and updates the job_run_details table. The job runs are executed either by client connections or by other background workers, depending on the cron.use_background_workers parameter. You can also limit the number of concurrently running jobs by setting the cron.max_running_jobs parameter.
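Both are ordinary postgresql.conf settings; an illustrative configuration might look like this:
# postgresql.conf (illustrative values)
cron.use_background_workers = on   # run jobs via background workers instead of client connections
cron.max_running_jobs = 5          # allow at most 5 jobs to run concurrently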

Execution Guarantees
pg_cron provides the following guarantees:
- At Most Once execution. Each job will run zero or one time according to its schedule and no more.
- Serial execution per job. If a job run executes for a long duration, causing it to overlap with its next scheduled run, pg_cron will not start a second, parallel run. Instead, it waits for the first run to complete, then immediately starts a new run. This allows jobs to be written without concurrency concerns. If parallelism is required, it can be achieved by scheduling multiple jobs of the same kind, as shown in the example below.
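For instance, two jobs with different names but the same command run independently and can overlap, giving a simple form of parallelism (a sketch; process_queue_batch() is an assumed procedure, not something provided by pg_cron):
SELECT cron.schedule('process-queue-1', '* * * * *', 'CALL process_queue_batch()');
SELECT cron.schedule('process-queue-2', '* * * * *', 'CALL process_queue_batch()');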
Running pg_cron in YugabyteDB
YugabyteDB strives to be highly compatible with PostgreSQL, so the goal with this extension is to maintain identical user-facing behavior. Since cron is a building block, this is crucial to ensure that user applications, especially other PostgreSQL extensions, work seamlessly on YugabyteDB.
Running pg_cron on YugabyteDB was a straightforward process. The cron code was pulled into the YugabyteDB repo, and a few minor tweaks were made to connect it to the YugabyteDB catalog and include it in the Postgres conf file. After that, all the pg_cron regression tests were running and passing! That’s all it takes for most extension ports to YugabyteDB.
However, pg_cron runs into a small problem here: YugabyteDB runs on several nodes, so every node would run the jobs, violating the At Most Once guarantee. To fix this, we need to ensure that pg_cron runs jobs on only one node at a time, and that if that node crashes, another node immediately takes over to ensure resilience.
Stateful Services Within a Distributed Database
Several YugabyteDB features require the ability to run a service on a single, identifiable primary node that can serve requests, along with an optional place to store data in a consistent manner. Like the rest of YugabyteDB, these internal microservices (nanoservices?) and their data must be resilient and highly available.
Each tablet (shard) in YugabyteDB offers the same guarantee via Raft. Raft ensures that there is only one leader and will elect a new leader if the current one fails. For each Stateful Service, a special tablet is created, and the service links itself to the Raft leader life cycle. The leader of this tablet hosts the service and stores the data for the service in an atomic and consistent manner. Internal clients can identify the tablet leader and route requests appropriately.
pg_cron and AutoAnalyze both currently make use of this framework.
PgCron Leader
The first time the pg_cron extension is created, the PG_CRON_LEADER service is created. Only the node that hosts this service executes cron jobs. This node is called the PgCronLeaderNode.
The pg_cron_launcher process is informed about its local node state via shared memory. On the leader node it works just like in PostgreSQL, whereas on non-leader nodes this process is made inactive.
The advantage of this architecture is that it provides the same guarantees you would expect from Postgres pg_cron, combined with the advantages of a distributed database.
Let’s start a local 3-node YugabyteDB cluster with pg_cron enabled:
$ ./bin/yugabyted start --advertise_address=127.0.0.1 --base_dir=~/yugabyte-data/node1 --cloud_location=aws.us-west.west-2a --master_flags="enable_pg_cron=true" --tserver_flags="enable_pg_cron=true"
$ ./bin/yugabyted start --advertise_address=127.0.0.2 --base_dir=~/yugabyte-data/node2 --cloud_location=aws.us-west.west-2b --master_flags="enable_pg_cron=true" --tserver_flags="enable_pg_cron=true" --join=127.0.0.1
$ ./bin/yugabyted start --advertise_address=127.0.0.3 --base_dir=~/yugabyte-data/node3 --cloud_location=aws.us-west.west-2c --master_flags="enable_pg_cron=true" --tserver_flags="enable_pg_cron=true" --join=127.0.0.1
In the http://127.0.0.1:7000/tablet-servers UI page, we can see the 3 nodes.

Now we can create the extension and run our job like we did in PostgreSQL:
yugabyte=# CREATE EXTENSION pg_cron;
CREATE EXTENSION
yugabyte=# CREATE TABLE tbl1(id SERIAL PRIMARY KEY, insert_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP, data TEXT);
CREATE TABLE
yugabyte=# SELECT cron.schedule('my job', '* * * * *', 'INSERT INTO tbl1(data) VALUES (''cron was here'')');
schedule
----------
1
(1 row)
yugabyte=# SELECT pg_sleep(180);
pg_sleep
----------
(1 row)
yugabyte=# SELECT * FROM tbl1 ORDER BY id;
id | insert_time | data
-----+----------------------------+---------------
1 | 2025-02-06 07:58:01.158603 | cron was here
101 | 2025-02-06 07:59:01.10112 | cron was here
201 | 2025-02-06 08:00:01.086597 | cron was here
(3 rows)
We can see the PG_CRON_LEADER is currently hosted on node-3 (west-2c) via http://127.0.0.1:7000/stateful-services:

Now let’s take down this node:
$ date && ./bin/yugabyted stop --base_dir=~/yugabyte-data/node3
Thu Feb 6 08:00:46 PST 2025
Stopped yugabyted using config /Users/hsunder/yugabyte-data/node3/conf/yugabyted.conf.

The PG_CRON_LEADER service has automatically moved to a new node, node-2 (west-2b) in this case:

If we query our table, we can see it was unaffected by this movement:
yugabyte=# SELECT * FROM tbl1 ORDER BY id;
id | insert_time | data
-----+----------------------------+---------------
1 | 2025-02-06 07:58:01.158603 | cron was here
101 | 2025-02-06 07:59:01.10112 | cron was here
201 | 2025-02-06 08:00:01.086597 | cron was here
301 | 2025-02-06 08:01:01.150388 | cron was here
401 | 2025-02-06 08:02:01.035135 | cron was here
501 | 2025-02-06 08:03:01.102009 | cron was here
(6 rows)
yugabyte=# SELECT * FROM cron.job_run_details ORDER BY runid;
jobid | runid | job_pid | database | username | command | status | return_message | start_time | end_time
-------+-------+---------+----------+----------+-------------------------------------------------+-----------+----------------+-------------------------------+-------------------------------
1 | 1 | 80276 | yugabyte | yugabyte | INSERT INTO tbl1(data) VALUES ('cron was here') | succeeded | INSERT 0 1 | 2025-02-06 07:58:01.05932-08 | 2025-02-06 07:58:01.184609-08
1 | 2 | 80835 | yugabyte | yugabyte | INSERT INTO tbl1(data) VALUES ('cron was here') | succeeded | INSERT 0 1 | 2025-02-06 07:59:01.048266-08 | 2025-02-06 07:59:01.130187-08
1 | 3 | 81414 | yugabyte | yugabyte | INSERT INTO tbl1(data) VALUES ('cron was here') | succeeded | INSERT 0 1 | 2025-02-06 08:00:01.042518-08 | 2025-02-06 08:00:01.116429-08
1 | 101 | 82294 | yugabyte | yugabyte | INSERT INTO tbl1(data) VALUES ('cron was here') | succeeded | INSERT 0 1 | 2025-02-06 08:01:01.044024-08 | 2025-02-06 08:01:01.162468-08
1 | 102 | 82860 | yugabyte | yugabyte | INSERT INTO tbl1(data) VALUES ('cron was here') | succeeded | INSERT 0 1 | 2025-02-06 08:02:01.007901-08 | 2025-02-06 08:02:01.069329-08
1 | 103 | 83421 | yugabyte | yugabyte | INSERT INTO tbl1(data) VALUES ('cron was here') | succeeded | INSERT 0 1 | 2025-02-06 08:03:01.03503-08 | 2025-02-06 08:03:01.259665-08
(6 rows)
For more information about the extension, check out the doc page.
Additional Notes
You may wonder why the cron jobs run on a single node instead of being distributed across multiple nodes. There are a couple of reasons for this:
- Distributed query execution. Even though the jobs are executed on one node, their execution will take advantage of the resources of other nodes via YugabyteDB features like automatic sharding, Predicate Push Down, and the Cost Based Optimizer.
- Cron jobs in YugabyteDB are not resource-intensive. In PostgreSQL, cron is often used to perform expensive maintenance operations like VACUUM, which are not needed in YugabyteDB. The most common use of cron is to run DDLs, like those issued by pg_partman.
We plan to implement distributed job scheduling in a future release as we continually assess and address our customers’ evolving needs in this area.
Conclusion
Reusing PostgreSQL’s code not only ensures YugabyteDB’s high compatibility with the PostgreSQL API, but also enables it to fully leverage the PostgreSQL extension ecosystem. This makes YugabyteDB an ideal choice for modern applications that demand both the familiarity of PostgreSQL and the resilience, high availability, and scalability of a cloud-native database.
The pg_cron extension in YugabyteDB enables you to seamlessly schedule automated jobs to monitor and maintain your databases. Check out our docs page for more information on how you can incorporate pg_cron into your YugabyteDB deployments.