Thursday, February 24, 2011

quick work_mem trick

One of my friends ended up with 4 million additional junk rows in a table due to an application bug. The table originally had around 54 million rows. Since the table was an important one, my friend asked me to delete those junk rows as soon as possible, with the following constraints.

  • No selective dump and restore, because my friend had no exact way of identifying the junk rows
  • No restore from backup, because that would need application downtime. Moreover, the table was in Slony replication, which meant I would have to unsubscribe/subscribe the Slony set on multiple slaves.
So my first task was to identify those junk rows. After a quick chat with the developers, I found that triggering the application bug once again created the same 4 million junk rows. In the staging environment we quickly created a junk_del table holding the 4 million rows that had to be deleted from the source table, app_table. The task was then easily solved with the following query.

DELETE FROM app_table WHERE row(site_inc_id,ad_inc_id) IN (SELECT site_inc_id, ad_inc_id FROM junk_del);
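
Before actually running the DELETE, it is worth sanity checking how many rows the subquery matches, so that nothing beyond the junk gets removed. A quick count along these lines (same table and column names as in the query above; the roughly 4 million figure is what we expected in our case):

-- count the rows the DELETE would remove; should be roughly 4 million here
SELECT count(*)
FROM app_table
WHERE row(site_inc_id, ad_inc_id) IN (SELECT site_inc_id, ad_inc_id FROM junk_del);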

Next I wanted a rough estimate of how long the query would take. Running EXPLAIN revealed that it was going to take about an hour to execute, which looked a little on the high side.
Digging around, we increased work_mem from the default of 1MB to 50MB for this session only, and as we expected the time came down to roughly 40 minutes.
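
For reference, the session-level change is just a SET before re-checking the plan; a minimal sketch, assuming a psql session (SET affects only the current session, so the server-wide default stays at 1MB):

-- raise work_mem only for this session
SET work_mem = '50MB';

-- re-check the plan; with more work_mem the planner can use a hashed subplan for the IN list
EXPLAIN
DELETE FROM app_table
WHERE row(site_inc_id, ad_inc_id) IN (SELECT site_inc_id, ad_inc_id FROM junk_del);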

To our delight, it finished in just 19 minutes when we executed it in production. To summarize, increasing work_mem was a quick and worthwhile trick that really helped us, and at the end of it my friend was happy, for obvious reasons.

Friday, February 18, 2011

Read Scalability with PostgreSQL/EnterpriseDB

Read Scalability with Slony and EnterpriseDB/PostgreSQL
One of the startups I consult for started to grow by leaps and bounds, which challenged me to achieve read scalability. As general wisdom goes, read scalability with PostgreSQL/EnterpriseDB can be achieved through various techniques like partitioning, sharding, replication, etc. Based on our existing infrastructure, we achieved this scalability through Slony and EnterpriseDB, taking the read load off the existing vertically scaled database Master.
The interesting aspects of our setup are the Slony communication across different data centers over a private VPN tunnel, and the stunnel setup between the hosts in the replication cluster for improved security. Moreover, since the Master and Slaves were in different data centers, communication between them was impossible in a few cases due to network security requirements, which forced us to create multiple replication clusters. Maintenance with multiple Slony clusters gets a little trickier when it comes to schema changes and unsubscription/subscription of sets, which is reserved for another blog post. The architecture diagram below shows our Slony setup in more detail.
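
With more than one Slony cluster to look after, a quick way to keep an eye on each of them is to query Slony's sl_status view on the origin node of every cluster. A minimal sketch, assuming a cluster named mycluster (Slony keeps its catalog in a schema called _<clustername>, so substitute your own cluster names):

-- replication lag for the cluster named mycluster, as seen from its origin node
SELECT st_origin, st_received, st_lag_num_events, st_lag_time
FROM _mycluster.sl_status;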



Cluster Creation Code
---------------------------------