provides software and services that enable enterprises
Live Chat 1-888-673-6564

Open Source Software Technical Articles

  • Home
  • Search
  • Contact Us
  • Products and Support
  • Services
  • Enterprise OSS Blog
  • Wazi Technical Blog
  • About Wazi
  • Attributions and Licensing
  • Supply Chain Compliance
  • How to Contribute
  • Contributors
  • Resources Library
  • Cloud Services
  • Partners
  • Customers
  • Community
  • Company
  • Careers
  • News and Events

Subscribe to Wazi by Email

Your email:


Enterprise Developer Support 24 x 7, Get a Support Quote Now!


click-here-to-chat-with-an-online-representative

download-oss-discovery

Latest Posts

  • The secret to great reporting with Drupal 7
  • A more colorful LibreOffice unveiled
  • Toward a more colorful LibreOffice
  • Flexible administration with Puppet's Facter and templates
  • Knock for OpenSSH
  • Get more out of phpMyAdmin
  • Image annotation in GIMP, Dia, and OpenOffice Draw
  • Solr, Drupal 7, and faceted search
  • Using FreeNAS' new full disk encryption for ZFS
  • Create distributed storage with Gluster

Connect with Us!

Current Articles | RSS Feed RSS Feed

Window Functions: PostgreSQL's Best-Kept Secret

Posted by Cameron Laird on Fri, Nov 18, 2011
  
Email This Email Article  
Tweet  
  

The PostgreSQL relational database management system (RDBMS) extends the standard SQL3 it implements with several conveniences for application development. Among the most useful extensions are window functions, which perform a calculation across a set of table rows that are somehow related to the current row.



Suppose you want to know the total media market by state from a reference tabulated in terms of city (or metropolitan) listings. You can write a query with a window function like this:




SELECT state_name, city_name, market_size, SUM(market_size)
OVER (PARTITION BY state_name) FROM markets;


and see something like:



Example population data
state_namecity_namemarket_sizesum(market_size)
California Bay Area 4.3 20.3 California Los Angeles 12.8 20.3 California San Diego 3.1 20.3 Illinois Chicago 9.5 9.5 New York NYC 18.9 20.0 New York Buffalo 1.1 20.0 ...


A slightly more involved expression might report, for instance:




    • quantitative comparison of sales to a particular customer with the average for all customers in the same sector;

    • the ranks of competitive swimmers within their own heats;

    • statistics on student performance only for classes that were among the top four in attendance within their departments;

    • normalized reactivity for certain biological reagants that lie within a specified range of molecular weights.



The examples above are typical of questions that come up in real-world reporting requirements. Window functions make their expression concise and maintainable.



Syntax of Window Functions



In PostgreSQL, you can create a window function by replacing any use of a function in a SELECT list, or its associated ORDER BY clause, with an OVER clause immediately after the function. In an ordinary select statement such as:




SELECT emp_id, salary, AVG(salary) FROM salaries;


the average is over all reported employees. Contrast this with a statement that uses a windows function:




SELECT emp_id, salary, AVG(salary) OVER(PARTITION by site) FROM salaries;


The first two columns of these two reports will be identical. The third column in the latter case, though, will show averages only for people at the same site. All employees working in Miami share one average, all those in Toronto have a different average, and so on.



Some window functions depend on the order of their inputs. You can control this order with an optional ORDER BY clause. ORDER BY can be particularly useful when considering budget problems, as this example shows:




SELECT request, sum(request)
FROM projects ORDER BY priority;


Example proposal data
requestsum(request)
35 250
110 250
25 250
80 250


Now change the query to:




SELECT request, sum(request) OVER (ORDER BY priority)
FROM projects ORDER BY priority;


and notice the subtly different result:



Example cumulative proposal data
requestsum(request)
35 35
110 145
25 170
80 250


One way to read this table in English is: "If we fund only the top-priority project, our total expense will be 35; if we fund the top two in priority, we'll spend 145; ..." This is a view of data that decision-makers frequently prize – and a window function makes it easy to see!


19a98812-f823-48dc-841e-bf029c63c6d7

Alternatives



It's certainly possible to compute the same results without window functions; plenty of applications do. Many retrieve items from a table into a host language such as Java or Ruby, then use procedural logic in the host language to calculate the same "windows." Among other disadvantages, this approach has the potential to transmit large volumes of undisplayed intermediate data over the network, and thus slow overall performance.



Another alternative is to use stored procedures or user-defined functions on the RDBMS side. While this minimizes runtime network traffic, it generally still falls short of window functions in clarity and query performance. When written from scratch, code to compute a window function often takes four to 10 times as many lines as the simpler window function-based query.



Scope and Status of Window Functions in PostgreSQL



The SQL/OLAP amendment to SQL:1999 (SQL:1999 is a synonym for SQL3) introduced a <window> clause to act on collections or partitions of data. Not long after this, the SQL:2003 standard formalized window functions as described here. The subsequent SQL:2008 standard (labeled "SQL-2008" in some Postgres documents) slightly refined the definition of window functions.



PostgreSQL gained its window functions with the release of 8.4 in mid-2009. Postgres targets the SQL:2008 definition with its implementation. The principal difference between the Postgres implementation and the SQL:2008 definition has to do with specialized treatment of NULLS; see the Note at the bottom of the Window Functions documentation page for more details. Also, in Postgres, any built-in or user-defined aggregate function, such as count, max, bit_or, or xmlagg, is available for computation as a window function.



Many of the queries or reports that arise in business intelligence (BI) and analytical applications can be expressed naturally in terms of window functions. When you prepare a table where a common value ties together several distinct rows, it's worth your time to investigate whether your calculation can best be done as a window function. You will save both your own development time, and runtime resource usage, when you write window functions. They're among the best and most useful standardized extensions to SQL3.

Follow @openlogic
Follow @CloudSwing

This work is licensed under a Creative Commons Attribution 3.0 Unported License
Creative Commons License.Follow @openlogic
Follow @OSCloudServices

This work is licensed under a Creative Commons Attribution 3.0 Unported License
Creative Commons License.
Tags: Technical, PostgreSQL, Ruby, Tips & Tricks, Database

Comments

Currently, there are no comments. Be the first to post one!
Post Comment
Name
 *
Email
 *
Website (optional)
Comment
 *

Allowed tags: <a> link, <b> bold, <i> italics

Loading...
Error sending email
Email sent successfully

Email article
Email To : 
Your name : 
Message : (maximum 200 characters)
Home | Search | Contact Us | Products and Support | Services | Enterprise OSS Blog | Wazi Technical Blog | Resources Library | Cloud Services | Partners | Customers | Community | Company | Careers | News and Events
Products
OpenLogic Exchange (OLEX)
License Compliance Module
OSS Discovery
OSS Deep Discovery
OpenUpdate
Services
Open Source Support
CentOS Support
Scanning & Compliance
Open Source Training
Professional Services
Solutions
Support & Indemnification
Open Source Governance
Open Source Scanning
Open Source Provisioning
Consulting & Training
Contact Us
1-888-673-6564


© 2013 OpenLogic, Inc. All rights reserved.
Site Map  |  Privacy Policy