Oct 20, 2014

Automated backup for PostgreSQL cluster on Windows with PowerShell

PostgreSQL base backup tool (pg_basebackup) was introduced with v9.1 and is primary used for creating a ready to go standby replica. Since v9.3 pg_basebackup supports WAL segments streaming via the -X stream option. With -X stream on, pg_basebackup will open a second connection to the server and start streaming the transaction log in parallel with the cluster files. The resulting backup file contains all the needed data to start a fresh replica or to restore the main cluster to it's original state. Version 9.4, currently in beta, will further improve pg_basebackup by allowing you to relocate not only the data store but also any table spaces you might have.

If you are running Postgres on Windows and you are looking for an automated way of doing full cluster backups then Backup-Postgres PowerShell script is a good starting point.

The Backup-Postgres script does the following:
  • checks if there is enough free space to make a new backup based on the last backup size (works only with a local backup path)
  • purges expired backups based on the supplied expire date
  • creates a new folder for each backup inside the root backup directory (the root path can be defined as local or network share)
  • calls pb_basebackup tool to begin a tar gzip backup of every table space, along with all required WAL segments (via "--xlog" option)
  • writes any encountered errors to Windows Event Log
  • writes backup elapsed time to Windows Event Log
The script can be download from TechNet Gallery or from GitHub Gist.

Configure PostgreSQL server

I'm assuming you have PostgreSQL version 9.3.x or newer installed on Windows Server 2012 or newer.

In order to make a base backup of the entire cluster using pg_basebackup tool you'll have to configure your server for streaming replication. PostgreSQL base backup uses the replication protocol to make a binary copy of the database cluster files and WAL segments without interfering with other connected clients. This kind of backup enables point-in-time recovery and can be used as a starting point for a streaming replication standby server.

Enable streaming replication:

Open postgres.conf located (by default, on a 64-bit install) in C:\Program Files\PostgreSQL\9.3\data\ and make the following changes:

wal_level = hot_standby
max_wal_senders = 3  
wal_keep_segments = 10
You should adjust the wal_keep_segments based on the amount of changes your server receives while in backup. By default, each WAL segment has 16MB, if you expect to have more than 160MB of changes in the time it will take to make the backup, then increase it.

Create a dedicated backup user in Postgres:

Open psql located in C:\Program Files\PostgreSQL\9.3\bin\, login as postgres and run the following command:

Allow streaming replication connections from pgbackup on locahost:

Open pg_hab.conf located in C:\Program Files\PostgreSQL\9.3\data\ and make the following changes:

host    replication    pgbackup    ::1/128    md5

Configure Windows server

Create a local Windows user named postgres. It doesn't need to have administrator rights, but it should have full access to the backup folder.
Log off from Windows and log in as postgres, navigate toC:\Users\postgres\AppData\Roaming\ and create a folder named postgresql. Inside postgresql create a file named pgpass.conf with the following content:

The pg_basebackup tool will look for this file to fetch the password.
Open Backup-Postgres.ps1 and modify the following variables to match your configuration:

# path settings
$BackupRoot = 'C:\Database\Backup';
$BackupLabel = (Get-Date -Format 'yyyy-MM-dd_HHmmss');

# pg_basebackup settings
$PgBackupExe = 'C:\Program Files\PostgreSQL\9.3\bin\pg_basebackup.exe';
$PgUser = 'pgbackup';

# purge settings
$ExpireDate = (Get-Date).AddDays(-7);
Now it's time to schedule the backup, open Windows Task Scheduler and create a new task. Setup the task to run whatever the user is logged on or not with highest privileges, use the postgres user for this. Add a recursive trigger, I've set mine to repeat every day indefinitely. You should carefully chose the best time to start the backup and that's when the server is less used. You should specify in the settings tab the rule Do not start a new instance if the task is already running, this will prevent running multiple backups in parallel.
Go to the Actions tab and add a new action:

powershell -ExecutionPolicy Bypass -File "C:\Jobs\Backup-Postgres.ps1"

Restore cluster from base backup

In order to restore a base backup with multiple table spaces, you'll have to extract each table space archive to it's original path. Since Windows doesn't have native support for tar.gz you can use the 7zip command line.
With 7zip you can extract a tar.gz archive without storing the intermediate tar file, 7zip can write to stdout and read from stdin using the following command:

7z x "base.tar.gz" -so | 7z x -aoa -si -ttar -o "C:\Program Files\PostgreSQL\9.3\data"
Restore steps:

1) Stop Postgres server
2) Delete the data folder content and all table spaces content (if you have enough free space, you should make a backup copy of the current data and table spaces)
3) Run the 7zip command and extract each archive to its corresponding folder
4) Create a recovery.conf file in data folder with the following content, specifying the postgres password:

standby_mode = 'on'
primary_conninfo = 'host=localhost port=5432 user=postgres password=PG-PASS'
5) Open pg_hba.conf file and comment all existing rules, this will prevent external clients from accessing the server while in recovery.
6) Start Postgres server. When Postgres starts it will process all WAL files and once recovery is finished the recovery.conf file gets renamed to recovery.done.
7) Restore pg_hba.conf to its original state and restart Postgres.

After getting used to the restore process you could automate it with PowerShell.

Oct 16, 2014

PostgreSQL unattended install on Windows Server with PowerShell

I often find myself in the situation where I need to install and configure PostgreSQL on a new VM running Windows. Because repetitive tasks are annoying and error prone, I've decided to automate this process as much as I can using PowerShell.

The Install-PostgreSQL PowerShell module does the following:
  • creates a local windows user that PostgreSQL will use (called postgres by default)
  • the password use for the creation of this account will be the same as the one used for PostgreSQL's postgres superuser account
  • creates postgres user profile
  • downloads the PostgreSQL installer provided by EnterpriseDB
  • installs PostgreSQL unattended using the supplied parameters
  • sets the postgres windows user as the owner of any PostgreSQL files and folders
  • sets PostgreSQL windows service to run under the postgres local user
  • creates the pgpass.conf file in AppData
  • copies configuration files to data directory
  • opens the supplied port that PostgreSQL will use in the Windows Firewall
The script can be download from TechNet Gallery or from GitHub Gist.


On the machine you want to install PostgreSQL, download Install-Postgres.zip file and extract it to the PowerShell Modules directory, usually located under Documents\WindowsPowerShell.
Open PowerShell as Administrator and run Import-Module Install-Postgres. Before running the unattended install you should customize the PostgreSQL configuration files located in Install-Postgres\Config directory.
You can also add a recovery.conf file if you plan to use this PostgreSQL cluster as a standby slave. All conf files located in Install-Postgres\Config will be copied to the PostgreSQL data directory once the server is installed.

Install PostgreSQL with defaults:
Import-Module Install-Postgres
Install-Postgres -User "postgres" -Password "ChangeMe!"
Install PostgreSQL full example:
-User "postgres"  
-Password "ChangeMe!"  
-InstallUrl "http://get.enterprisedb.com/postgresql/postgresql-9.3.5-1-windows-x64.exe"  
-InstallPath "C:\Program Files\PostgreSQL\9.3"  
-DataPath "C:\Program Files\PostgreSQL\9.3\data"  
-Locale "Romanian, Romania"  
-Port 5432  
-ServiceName "postgresql"

Oct 6, 2014

UX practices for newly born websites

You just launched your website and now you're eager to see if users actually like what you've done, but hold your horses and don't jump to conclusions very fast.  Let the user enjoy what you have created before throwing them a big modal screen with a survey. You'll lose them twice as fast as you gained them. 

I’m not saying that you shouldn't care but you should make smart choices when it comes to user feedback in the first month or so of your product. 

Here's what you can do:

  1. Analytics – the good old friend, it should be there from the beginning. Watch for everything: audience, technology, bounce rate, referrals, direct search (for this I would suggest using webmaster tools as well), I mean everything. 
  2. Video recording from site – the next best thing, you can record everything a user does on your site, what he clicks, how he scrolls and if he has difficulties finding stuff on the page or completing a form. It's easier to understand user behavior from this than from a heat map tool. I'm using Clicktale for this job, you can try it for free or you can find other similar products
  3. A/B testing – this is the ideal time, right at the beginning, before users get comfortable with the way your product looks. Throw some color on those buttons baby, see what pops! 
  4. Social media – even if you don't have a Twitter/Facebook page (really, why don’t you?) people are still going to talk about you, so make sure you search for your brand's name across multiple social media platforms

It's been a month or more? OK, take your surveys, don't forget to add some incentive (it's not the end of the world to give them some discounts), call people, interview clients and so on. Never be too aggressive, aggressive is obsolete!