Working days PostgreSQL function

Looking for a handy way to calculate the number of working days between two dates?! Here is a how-to.

You’ll adjust the following to your needs; note that feeding reference tables with worldwide non-working days and per-country working weeks might be tricky, if not costly.

The following examples use the ISO 3166-1 standard for two-character country codes.

Say you have two reference tables:

  • week working days (they might differ from one country to another)
  • public holidays

Let’s call the first one week_noworking_days:

CREATE TABLE week_noworking_days (
    country character(2) NOT NULL,
    day     int NOT NULL
);

Feed that table with the days of the week (dow) that are non-working days per country: most countries have Saturdays and Sundays off (dow 6 and 0), but there are exceptions.
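For example, here are a few sample rows (illustrative values, adjust to your countries): the US takes Sunday and Saturday off, while Israel is one of the exceptions with Friday and Saturday off.

INSERT INTO week_noworking_days (country, day) VALUES
    ('US', 0), ('US', 6),  -- Sunday, Saturday
    ('IL', 5), ('IL', 6);  -- Friday, Saturday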

Have a second reference table holding bank holidays:

CREATE TABLE bank_holidays (
    country character(2) NOT NULL,
    day     date NOT NULL,
    label   text
);
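A couple of sample rows, using two fixed-date US holidays for illustration:

INSERT INTO bank_holidays (country, day, label) VALUES
    ('US', '2017-01-01', 'New Year''s Day'),
    ('US', '2017-07-04', 'Independence Day');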

Feed it easily from Google calendars, e.g. https://www.google.com/calendar/ical/en.usa%23holiday%40group.v.calendar.google.com/public/basic.ics

It might not exactly suit your needs though; there are paid solutions around otherwise.

Once you have the preceding set up, the following PostgreSQL function will return the number of working days between two dates, taking a list of countries as parameter:

CREATE OR REPLACE FUNCTION working_days_between (IN _from date, IN _to date, IN _countries text[] DEFAULT ARRAY['US'::text]) RETURNS int8 AS
$BODY$
  SELECT CASE WHEN _from <= _to THEN count(*) ELSE -count(*) END FROM
    (SELECT CASE WHEN _from <= _to THEN _from + x ELSE _to + x END AS dday
     FROM generate_series(1, abs(_to - _from)) x
     EXCEPT SELECT day::date FROM bank_holidays WHERE country = ANY (_countries)) AS days
  WHERE extract('dow' FROM dday) NOT IN (SELECT day FROM week_noworking_days WHERE country = ANY (_countries));
$BODY$
LANGUAGE sql
STABLE;
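Assuming the reference tables above are populated, calling it looks like this (the result depends on your data, so no output is shown):

-- Working days in July 2017 for the default US calendar.
SELECT working_days_between('2017-07-01', '2017-07-31');

-- With several countries, a day is excluded as soon as it is
-- a holiday or week non-working day in any of them.
SELECT working_days_between('2017-07-01', '2017-07-31', ARRAY['US', 'FR']);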

Useful ?!

Using ssh agent forwarding with a docker container

Say that, for some reason such as deployment needs, you want some Docker-hosted process to use your local ssh keys without having to enter a passphrase.

That’s where ssh-agent forwarding comes in handy !

ssh-agent is a program that keeps your keys in memory; the aim here is to share it with the Docker container.

Here’s how to set it up on Ubuntu 16 running a Debian Jessie image:

docker run --rm -it --name container_name \
-v $(dirname $SSH_AUTH_SOCK):$(dirname $SSH_AUTH_SOCK) \
-e SSH_AUTH_SOCK=$SSH_AUTH_SOCK my_image

$SSH_AUTH_SOCK contains the path to the socket file used to communicate with the agent, here: /run/user/1001/keyring/ssh

So we mount its directory as a volume in the container; that’s what the -v $(dirname $SSH_AUTH_SOCK):$(dirname $SSH_AUTH_SOCK) part does.
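To see exactly what gets mounted, here is what dirname yields on the sample socket path above (your path will differ):

```shell
# Sample socket path from this post; yours will differ.
SSH_AUTH_SOCK=/run/user/1001/keyring/ssh
# dirname strips the socket file name, leaving the directory
# that gets bind-mounted into the container.
dirname "$SSH_AUTH_SOCK"   # → /run/user/1001/keyring
```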

The -e SSH_AUTH_SOCK=$SSH_AUTH_SOCK part defines the environment variable on the container.

You may now use your local ssh keys within the container. Enjoy !


Simple database process parallelization on multi-CPU database servers

Say you have a multi-CPU database server such as MySQL, MariaDB or PostgreSQL that can only spread one session’s load over one CPU.

This post aims at providing an easy and simple way to overcome this using a basic shell command: xargs!

seq 1 10 | xargs -P 4 -I {} echo "select * from {}"

This sends a sequence to xargs, which forks off an individual process for each item, running up to 4 of them in parallel (you got it right, that’s the -P parameter).

Sample output (the order can differ between executions):

select * from 1
select * from 2
select * from 3
select * from 4
select * from 5
select * from 6
select * from 7
select * from 9
select * from 10
select * from 8
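A quick sanity check (not from the original run) confirms that parallel execution reorders lines but loses none of them:

```shell
# All 10 generated commands come out, whatever the order.
seq 1 10 | xargs -P 4 -I {} echo "select * from {}" | wc -l   # → 10
```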

In case you don’t find this awesome you may stop reading this post right now.

In case you did, I’m sure you’re beginning to see how it can be used in so many cases, but in particular: our database processes.

Let’s take a simple Postgres example:

psql -t -c "select array_agg(jobid) from myschema.myjobtable t group by mygroupcolumn" | \
xargs -P 1 -I {} \
psql -c "begin; select (myschema.myjob('{}')).*; commit;"

That one retrieves a job list from the database and runs the myschema.myjob stored procedure, which takes a job id list as parameter, on 1 process. Looking at our database server load with top, we can see only one CPU is used:

top - 14:58:30 up 134 days,  4:20,  1 user,  load average: 0.57, 0.23, 0.08
Tasks:  38 total,   2 running,  36 sleeping,   0 stopped,   0 zombie
%Cpu0  :  0.3 us,  0.0 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 99.0 us,  0.3 sy,  0.0 ni,  0.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   6291456 total,  6280152 used,    11304 free,        0 buffers
KiB Swap:  2097152 total,   229232 used,  1867920 free.  6168656 cached Mem

Now, running the same job with as many processes as our database server has CPUs:

psql -t -c "select array_agg(jobid) from myschema.myjobtable t group by mygroupcolumn" | \
xargs -P 4 -I {} \
psql -c "begin; select (myschema.myjob('{}')).*; commit;"

top - 15:00:17 up 134 days,  4:22,  1 user,  load average: 1.27, 0.52, 0.20
Tasks:  41 total,   5 running,  36 sleeping,   0 stopped,   0 zombie
%Cpu0  : 98.3 us,  1.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  : 99.1 us,  0.9 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 91.2 us,  0.4 sy,  0.0 ni,  8.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  : 92.2 us,  1.3 sy,  0.0 ni,  6.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   6291456 total,  6277252 used,    14204 free,        0 buffers
KiB Swap:  2097152 total,   321844 used,  1775308 free.  6070056 cached Mem

We can see the load of our database process is spread over the 4 CPUs. Ain’t it cool ?!