PostgreSQL, one of the original open-source relational database management systems (RDBMS), is seeing a strong resurgence in the industry. A recent Gartner report states that “By 2018, more than 70% of new in-house applications will be developed on an open-source RDBMS, and 50% of existing commercial RDBMS instances will have been converted or will be in process."
This is not really surprising given the RDBMS’s long list of useful features and capabilities, large performance increases over the past 9.x releases, and full enterprise-grade support through companies like CitusData and EnterpriseDB.
PostgreSQL 9.1 introduced Foreign Data Wrappers (FDWs) to enable links with external data sources using standard database queries. This allows direct query access to multiple data sources while eliminating the need for expensive extract, transform, and load (ETL) operations. Many companies will implement a PostgreSQL database as an interface to a set of disparate data sources through FDWs, thereby eliminating the need to set up large amounts of local storage to replicate datasets seen in a traditional data warehouse.
With the release of PostgreSQL 9.2, the database supported JSON as a data type with support for filters on nested keys within the JSON text. In release 9.4, binary JSONB was added to improve performance by parsing the JSON text on INSERTS instead of at every query. JSONB also supports indexes on all keys to improve querying performance.
Planned support with PostgreSQL 9.5 includes new features such as Import Foreign Schema to easily create a foreign table in PostgreSQL and Grouping Sets and Cubes, as well as extending existing functionality (JSON operations, UPSERT, TABLE SAMPLE, etc.) and enhancing performance.
In terms of performance, PostgreSQL has often been considered reliable but slow by many in the database community, but the development community is working hard to change that perception. Tomas Vondra posted some performance comparisons on pgaddict.com that compared each version from 8.0 to 9.4 in TPC-DS workloads.
As you can see in the graph below, the current 9.4 release completed queries in less than 25% of the time it took the 8.0 release to complete the same queries. And releases 9.0 to 9.4 have seen the time required to run cut in half. Further improvements are on the way with the 9.5 release scheduled for Q4 of 2015.
Great features and great performance are wonderful, but PostgreSQL wouldn’t be seeing nearly the same traction if it weren’t for great enterprise-grade support from third parties like CitusData or EnterpriseDB.
Each company provides some add-ons to the base PostgreSQL release, as well as full support for their respective products. (Among other things, CitusData releases an open-source column-store extension for PostgreSQL to improve analytics queries, and EnterpriseDB provides foreign data wrappers for various databases, as well as numerous code contributions to the codebase.)
Each company also provides enterprise-grade support services for their respective distributions of PostgreSQL. EnterpriseDB even offers Oracle migration software and services to help customers migrate from the more expensive Oracle Database. It appears to be working: In 2014 EnterpriseDB reported a 70% year-on-year increase in software subscription revenue.
With this in mind, my team has added PostgreSQL to our application performance testing environment. We’re characterizing the performance of PostgreSQL against different SSDs with different workloads so we can provide configuration guidance for PostgreSQL systems to help customers gain the best value from their SSD purchases.
Our first report, just released on micron.com, shows huge performance gains with PostgreSQL when traditional rotating hard drives are replaced with enterprise-grade SSDs. It also shows that the standard practice of intelligently locating different database objects on different storage media works well in a PostgreSQL environment. (The numbers represented here are a modified version of the TPC-C as implemented by HammerDB.)
Just moving the WAL Logs from HDDs to SSDs improved transaction throughput (NOPM) by 51% while simultaneously reducing the average response time by 89%.
A future report will describe PostgreSQL scaling with a TPC-H-like workload as we scale the number of users and the number of drives. The preliminary data shows that PostgreSQL scales well as available storage throughput (I/O) resources are added, and I’m excited to share the results in the next few months.
With the wealth of features, great performance, and enterprise-support options, PostgreSQL is a great way to reduce software costs associated with proprietary alternatives and shift spending to hardware where huge performance gains are just a few drives away.
Be sure to read our first published PostgreSQL performance report mentioned above and connect with us on Twitter @MicronStorage. I look forward to your questions and comments regarding the work my team is doing in the application acceleration area.