By Vasudev Ram
Hi readers,
Control break reports are very common in data processing, from the earliest days of computing until today. This is because they are a fundamental kind of report, the need for which is ubiquitous across many kinds of organizations.
Here is an example program that generates a control break report and writes it to PDF, using xtopdf, my Python toolkit for PDF creation.
The program is named ControlBreakToPDF.py. It uses xtopdf to generate the PDF output, and the groupby function from the itertools module to handle the control break logic easily.
I've written multiple control-break report generation programs before, including implementing the logic manually, and it can get a little fiddly to get everything just right, particularly when there is more than one level of nesting (i.e. no off-by-one errors, etc.); you have to check for various conditions, set flags, etc.
So it's nice to have Python's itertools.groupby functionality handle it, at least for basic cases. Note that the data needs to be sorted on the grouping key, in order for groupby to work. Here is the code for ControlBreakToPDF.py:
So the itertools.groupby function basically provides roughly the same sort of functionality that SQL's GROUP BY clause provides (of course, when included in a complete SELECT statement). The difference is that with Python's groupby, you do the grouping and related processing in your program code, on data which is in memory, while if using SQL via a client-server RDBMS from your program, the grouping and processing will happen on the database server and only the aggregate results will be sent to your program to process further. Both methods can have pros and cons, depending on the needs of the application.
In my next post about Python, I'll use this program as one vehicle to demonstrate some uses of randomness in testing, continuing the series titled "The many uses of randomness", the earlier two parts of which are here and here.
- Enjoy.
- Vasudev Ram - Online Python training and consultingFollow me on Gumroad to be notified about my new products:
.gumroad-follow-form-embed { zoom: 1; } .gumroad-follow-form-embed:before, .gumroad-follow-form-embed:after { display: table; line-height: 0; content: ""; } .gumroad-follow-form-embed:after { clear: both; } .gumroad-follow-form-embed * { margin: 0; border: 0; padding: 0; outline: 0; box-sizing: border-box !important; float: left !important; } .gumroad-follow-form-embed input { border-radius: 4px; border-top-right-radius: 0; border-bottom-right-radius: 0; font-family: -apple-system, ".SFNSDisplay-Regular", "Helvetica Neue", Helvetica, Arial, sans-serif; font-size: 15px; line-height: 20px; background: #fff; border: 1px solid #ddd; border-right: 0; color: #aaa; padding: 10px; box-shadow: inset 0 1px 0 rgba(0, 0, 0, 0.02); background-position: top right; background-repeat: no-repeat; text-rendering: optimizeLegibility; font-smoothing: antialiased; -webkit-appearance: none; -moz-appearance: caret; width: 65% !important; height: 40px !important; } .gumroad-follow-form-embed button { border-radius: 4px; border-top-left-radius: 0; border-bottom-left-radius: 0; box-shadow: 0 1px 1px rgba(0, 0, 0, 0.12); -webkit-transition: all .05s ease-in-out; transition: all .05s ease-in-out; display: inline-block; padding: 11px 15px 12px; cursor: pointer; color: #fff; font-size: 15px; line-height: 100%; font-family: -apple-system, ".SFNSDisplay-Regular", "Helvetica Neue", Helvetica, Arial, sans-serif; background: #36a9ae; border: 1px solid #31989d; filter: "progid:DXImageTransform.Microsoft.gradient(startColorstr=#5ccfd4, endColorstr=#329ca1, GradientType=0)"; background: -webkit-linear-gradient(top, #5ccfd4, #329ca1); background: linear-gradient(to bottom, #5ccfd4, #329ca1); height: 40px !important; width: 35% !important; } My Python posts Subscribe to my blog by emailMy ActiveState recipes
Hi readers,
Control break reports are very common in data processing, from the earliest days of computing until today. This is because they are a fundamental kind of report, the need for which is ubiquitous across many kinds of organizations.
Here is an example program that generates a control break report and writes it to PDF, using xtopdf, my Python toolkit for PDF creation.
The program is named ControlBreakToPDF.py. It uses xtopdf to generate the PDF output, and the groupby function from the itertools module to handle the control break logic easily.
I've written multiple control-break report generation programs before, including implementing the logic manually, and it can get a little fiddly to get everything just right, particularly when there is more than one level of nesting (i.e. no off-by-one errors, etc.); you have to check for various conditions, set flags, etc.
So it's nice to have Python's itertools.groupby functionality handle it, at least for basic cases. Note that the data needs to be sorted on the grouping key, in order for groupby to work. Here is the code for ControlBreakToPDF.py:
from __future__ import print_functionRunning it gives this output on the screen:
# ControlBreakToPDF.py
# A program to show how to write simple control break reports
# and send the output to PDF, using itertools.groupby and xtopdf.
# Author: Vasudev Ram
# Copyright 2016 Vasudev Ram
# http://jugad2.blogspot.com
# https://gumroad.com/vasudevram
from itertools import groupby
from PDFWriter import PDFWriter
# I hard-code the data here to make the example shorter.
# More commonly, it would be fetched at run-time from a
# database query or CSV file or similar source.
data = \
[
['North', 'Desktop #1', 1000],
['South', 'Desktop #3', 1100],
['North', 'Laptop #7', 1200],
['South', 'Keyboard #4', 200],
['North', 'Mouse #2', 50],
['East', 'Tablet #5', 200],
['West', 'Hard disk #8', 500],
['West', 'CD-ROM #6', 150],
['South', 'DVD Drive', 150],
['East', 'Offline UPS', 250],
]
pw = PDFWriter('SalesReport.pdf')
pw.setFont('Courier', 12)
pw.setHeader('Sales by Region')
pw.setFooter('Using itertools.groupby and xtopdf')
# Convenience function to both print to screen and write to PDF.
def print_and_write(s, pw):
print(s)
pw.writeLine(s)
# Set column headers.
headers = ['Region', 'Item', 'Sale Value']
# Set column widths.
widths = [ 10, 15, 10 ]
# Build header string for report.
header_str = ''.join([hdr.center(widths[ind]) \
for ind, hdr in enumerate(headers)])
print_and_write(header_str, pw)
# Function to base the sorting and grouping on.
def key_func(rec):
return rec[0]
data.sort(key=key_func)
for region, group in groupby(data, key=key_func):
print_and_write('', pw)
# Write group header, i.e. region name.
print_and_write(region.center(widths[0]), pw)
# Write group's rows, i.e. sales data for the region.
for row in group:
# Build formatted row string.
row_str = ''.join(str(col).rjust(widths[ind + 1]) \
for ind, col in enumerate(row[1:]))
print_and_write(' ' * widths[0] + row_str, pw)
pw.close()
$ python ControlBreakToPDF.pyAnd this is a screenshot of the PDF output, viewed in Foxit PDF Reader:
Region Item Sale Value
East
Tablet #5 200
Offline UPS 250
North
Desktop #1 1000
Laptop #7 1200
Mouse #2 50
South
Desktop #3 1100
Keyboard #4 200
DVD Drive 150
West
Hard disk #8 500
CD-ROM #6 150
$
So the itertools.groupby function basically provides roughly the same sort of functionality that SQL's GROUP BY clause provides (of course, when included in a complete SELECT statement). The difference is that with Python's groupby, you do the grouping and related processing in your program code, on data which is in memory, while if using SQL via a client-server RDBMS from your program, the grouping and processing will happen on the database server and only the aggregate results will be sent to your program to process further. Both methods can have pros and cons, depending on the needs of the application.
In my next post about Python, I'll use this program as one vehicle to demonstrate some uses of randomness in testing, continuing the series titled "The many uses of randomness", the earlier two parts of which are here and here.
- Enjoy.
- Vasudev Ram - Online Python training and consultingFollow me on Gumroad to be notified about my new products:
.gumroad-follow-form-embed { zoom: 1; } .gumroad-follow-form-embed:before, .gumroad-follow-form-embed:after { display: table; line-height: 0; content: ""; } .gumroad-follow-form-embed:after { clear: both; } .gumroad-follow-form-embed * { margin: 0; border: 0; padding: 0; outline: 0; box-sizing: border-box !important; float: left !important; } .gumroad-follow-form-embed input { border-radius: 4px; border-top-right-radius: 0; border-bottom-right-radius: 0; font-family: -apple-system, ".SFNSDisplay-Regular", "Helvetica Neue", Helvetica, Arial, sans-serif; font-size: 15px; line-height: 20px; background: #fff; border: 1px solid #ddd; border-right: 0; color: #aaa; padding: 10px; box-shadow: inset 0 1px 0 rgba(0, 0, 0, 0.02); background-position: top right; background-repeat: no-repeat; text-rendering: optimizeLegibility; font-smoothing: antialiased; -webkit-appearance: none; -moz-appearance: caret; width: 65% !important; height: 40px !important; } .gumroad-follow-form-embed button { border-radius: 4px; border-top-left-radius: 0; border-bottom-left-radius: 0; box-shadow: 0 1px 1px rgba(0, 0, 0, 0.12); -webkit-transition: all .05s ease-in-out; transition: all .05s ease-in-out; display: inline-block; padding: 11px 15px 12px; cursor: pointer; color: #fff; font-size: 15px; line-height: 100%; font-family: -apple-system, ".SFNSDisplay-Regular", "Helvetica Neue", Helvetica, Arial, sans-serif; background: #36a9ae; border: 1px solid #31989d; filter: "progid:DXImageTransform.Microsoft.gradient(startColorstr=#5ccfd4, endColorstr=#329ca1, GradientType=0)"; background: -webkit-linear-gradient(top, #5ccfd4, #329ca1); background: linear-gradient(to bottom, #5ccfd4, #329ca1); height: 40px !important; width: 35% !important; } My Python posts Subscribe to my blog by emailMy ActiveState recipes