Sunday, February 25, 2018

rrd data in postgres?

source


With the latest advances in PostgreSQL (and other databases), a relational database begins to look like a very viable TS storage platform. In this write-up I attempt to show how to store TS in PostgreSQL. (2016-12-17 update: there is a part 2 of this article.)
A TS is a series of [timestamp, measurement] pairs, where measurement is typically a floating point number. These pairs (aka “data points”) usually arrive at a high and steady rate. As time goes on, detailed data usually becomes less interesting and is often consolidated into larger time intervals until ultimately it is expired.

The obvious approach

The “naive” approach is a three-column table, like so:
 CREATE TABLE ts (id INT, time TIMESTAMPTZ, value REAL);
(Let’s gloss over some details such as an index on the time column and the choice of data type for time and value, as they’re not relevant to this discussion.)
One problem with this is the inefficiency of appending data: an insert requires a lookup of the new id, takes locks, and (usually) blocks until the data is synced to disk. Given the “firehose” nature of TS data, the database can quite quickly get overwhelmed.
This approach also does not address consolidation and eventual expiration of older data points.

Round-robin database

A better alternative is something called a round-robin database (RRD): a circular structure with a separately stored pointer denoting the last element and its timestamp.
An everyday example of an RRD is a week. Imagine a structure of 7 slots, one for each day of the week. If you know today’s date and day of the week, you can easily infer the date for each slot. For example, if today is Tuesday, April 1, 2008, then the Monday slot refers to March 31st, Sunday to March 30th and (most notably) Wednesday to March 26th.
Here’s what a 7-day RRD of average temperature might look like as of Tuesday, April 1:
Week day: Sun  Mon  Tue  Wed  Thu  Fri  Sat
Date:     3/30 3/31 4/1  3/26 3/27 3/28 3/29
Temp F:   79   82   90   69   75   80   81
                    ^
                    last entry
Come Wednesday, April 2nd, our RRD now looks like this:
Week day: Sun  Mon  Tue  Wed  Thu  Fri  Sat
Date:     3/30 3/31 4/1  4/2  3/27 3/28 3/29
Temp F:   79   82   90   92   75   80   81
                         ^
                         last entry
Note how little has changed, and that the update required no allocation of space: all we did to record 92F on Wednesday was overwrite one value. Even more remarkably, the previous value automatically “expired” when we overwrote it, thus solving the eventual expiration problem without any additional operations.
RRDs are also very space-efficient. In the above example we specified the date of every slot for clarity. In an actual implementation only the date of the last slot needs to be stored, so the RRD can be kept as a sequence of 7 numbers plus the position of the last entry and its timestamp. In Python syntax it’d look like this:
[[79,82,90,92,75,80,81], 3, 1207108800]
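(The 0-based position 3 is the Wednesday slot, and 1207108800 corresponds to April 2, 2008.) To make the update mechanics concrete, here is a minimal Python sketch of the overwrite-on-update logic. It is my own illustration, not from the original article; rrd_update and STEP are invented names, and it assumes data points never arrive out of order:

STEP = 86400  # one slot per day, in seconds

def rrd_update(rrd, ts, value):
    """Record value at unix timestamp ts by overwriting a slot.

    Overwriting is what expires old data - no deletes are needed."""
    dp, pos, last_ts = rrd
    n = len(dp)
    slots_ahead = (ts - last_ts) // STEP
    # Blank any slots that were skipped over (no data arrived for them).
    for i in range(1, min(slots_ahead, n + 1)):
        dp[(pos + i) % n] = None
    pos = (pos + slots_ahead) % n
    dp[pos] = value
    return [dp, pos, last_ts + slots_ahead * STEP]

Starting from the Tuesday state [[79,82,90,69,75,80,81], 2, 1207022400], calling rrd_update(rrd, 1207108800, 92) produces exactly the state shown above.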

Round-robin in PostgreSQL

Here is a naive approach to a round-robin table. Carrying on with our 7-day RRD example, it might look like this:
week_day | temp_f
---------+--------
     1   |   79
     2   |   82
     3   |   90
     4   |   69
     5   |   75
     6   |   80
     7   |   81
Somewhere separately we’d also need to record that the last entry is week_day 3 (Tuesday) and that its date is 2008-04-01. Come April 2, we could record the temperature using:
UPDATE series SET temp_f = 92 WHERE week_day = 4;
This might be okay for a 7-slot RRD, but a more typical TS might have a slot per minute going back 90 days, which would require 129,600 rows (90 × 1440 minutes). Recording data points one at a time might be fast enough, but copying the whole RRD would take 129,600 UPDATE statements, which is not very efficient.
This is where PostgreSQL arrays become very useful.

Using PostgreSQL arrays

An array would allow us to store the whole series in a single row. Sticking with the 7-day RRD example, our table would be created as follows:
CREATE TABLE ts (dp DOUBLE PRECISION[] NOT NULL DEFAULT '{}',
                 last_date DATE,
                 pos INT);
(Never mind that there is no id column for now.)
We could populate the whole RRD in a single statement:
INSERT INTO ts VALUES('{79,82,90,69,75,80,81}', '2008-04-01', 3);
…or record 92F for Wednesday as so:
UPDATE ts SET dp[4] = 92, last_date = '2008-04-02', pos = 4;
(In PostgreSQL, arrays are 1-based, not 0-based as in most programming languages.)
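As an aside, reading the series back in chronological order is straightforward with PostgreSQL’s array functions. A sketch against the table above (my own illustration, not from the original article):

SELECT last_date - ((pos - i + 7) % 7) AS day,
       dp[i] AS temp_f
  FROM ts, generate_subscripts(dp, 1) AS i
 ORDER BY day;

Each subscript i is turned into a date by counting how many slots it sits behind pos, the slot holding last_date.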

But it could be even more efficient

Under the hood, PostgreSQL data is stored in 8K pages. It makes sense to keep the chunks in which our RRD is written to disk in line with the page size, or at least smaller than one page. (PostgreSQL provides configuration parameters for how much of a page is used, etc., but that is way beyond the scope of this article.)
Having the series split into chunks also paves the way for some kind of a caching layer: we could have a server which waits for one row’s worth of data points to accumulate, then flushes them all at once, as sketched below.
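A minimal sketch of that buffering idea (my own illustration; RowBuffer and flush_fn are invented names):

class RowBuffer:
    """Accumulate data points until a full row can be flushed in one write."""

    def __init__(self, flush_fn, row_size=7):
        self.flush_fn = flush_fn  # e.g. issues a single UPDATE ... SET dp = ...
        self.row_size = row_size
        self.points = []

    def add(self, value):
        self.points.append(value)
        if len(self.points) >= self.row_size:
            self.flush_fn(self.points)  # one write instead of row_size writes
            self.points = []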
For simplicity, let’s take the above example and expand the RRD to 4 weeks, while keeping 1 week per row. In our table definition we need to provide a way of keeping the rows of the TS in order, with a column named n. And while we’re at it, we might as well introduce the notion of an id, so as to be able to store multiple TS in the same table.
Let’s start with two tables, one called rrd where we would store the last position and date, and another called ts which would store the actual data.
CREATE TABLE rrd (
  id SERIAL NOT NULL PRIMARY KEY,
  last_date DATE,
  last_pos INT);

CREATE TABLE ts (
  rrd_id INT NOT NULL,
  n INT NOT NULL,
  dp DOUBLE PRECISION[] NOT NULL DEFAULT '{}');
We could then populate the TS with fictitious data like so:
INSERT INTO rrd (id, last_date, last_pos) VALUES (1, '2008-04-01', 24);

INSERT INTO ts VALUES (1, 1, '{64,67,70,71,72,69,67}');
INSERT INTO ts VALUES (1, 2, '{65,60,58,59,62,68,70}');
INSERT INTO ts VALUES (1, 3, '{71,72,77,70,71,73,75}');
INSERT INTO ts VALUES (1, 4, '{79,82,90,69,75,80,81}');
To update the data for April 2, we would:
UPDATE ts SET dp[4] = 92 WHERE rrd_id = 1 AND n = 4;
UPDATE rrd SET last_date = '2008-04-02', last_pos = 25 WHERE id = 1;
The last_pos of 25 is (n - 1) * 7 + 4: the 4th slot of the 4th row (positions, like arrays, are 1-based).
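This position arithmetic is easy to get wrong with 1-based indexing, so here is a small Python sketch (the function name is mine, not from the article):

SLOTS_PER_ROW = 7

def row_and_index(pos):
    """Map a 1-based global slot position to (row n, 1-based array index)."""
    n = (pos - 1) // SLOTS_PER_ROW + 1
    idx = (pos - 1) % SLOTS_PER_ROW + 1
    return n, idx

row_and_index(25) returns (4, 4): the April 2 data point lands in row n = 4, at dp[4].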
This article omits a lot of detail, such as supporting resolution finer than one day, but it does describe the general idea. For an actual implementation you might want to check out a project I’ve been working on: tgres.

Innovation engine

source

Practically every company innovates. But few do so in an orderly, reliable way. In far too many organizations, the big breakthroughs happen despite the company. Successful innovations typically follow invisible development paths and require acts of individual heroism or a heavy dose of serendipity. Successive efforts to jump-start innovation through, say, hack-a-thons, cash prizes for inventive concepts, and on-again, off-again task forces frequently prove fruitless. Great ideas remain captive in the heads of employees, innovation initiatives take way too long, and the ideas that are developed are not necessarily the best efforts or the best fit with strategic priorities.
Most executives will freely admit that their innovation engine doesn’t hum the way they would like it to. But turning sundry innovation efforts into a function that operates consistently and at scale feels like a monumental task. And in many cases it is, requiring new organizational structures, new hires, and substantial investment, as the “innovation factory” Procter & Gamble built in the early 2000s did.
For the past decade we’ve been helping organizations around the globe strengthen their innovation capabilities, and that work has taught us that there’s an important intermediate option between ad hoc innovation and building an elaborate, large-scale innovation factory: setting up a minimum viable innovation system (MVIS).
We borrow the language for this term from the world of lean start-ups, where “minimum viable product” denotes a stripped-down functional prototype used as a starting point for developing a new offering. “Minimum viable innovation system” refers to the essential building blocks that allow a company to begin creating a reliable, strategically focused innovation function. An MVIS will ensure that good ideas are encouraged, identified, shared, reviewed, prioritized, resourced, developed, rewarded, and celebrated. But it will not require years of work, fundamental changes to the way the organization runs, or a significant reallocation of resources.
What it will require is senior management attention—most critically from some member of the top leadership team. That might be the chief executive officer or a chief innovation officer, but it doesn’t have to be. If you’re responsible for innovation in your company at the highest level, we’re talking to you. With a little help from other executives and innovation practitioners, you can set up an MVIS by completing four basic steps in no more than 90 days, with limited investment and without hiring anyone extra. And as early success builds confidence in your innovation capabilities, it will set the stage for further progress.

Day 1 to 30: Define Your Innovation Buckets

There’s no shortage of terms for innovation. Sustaining innovations, incremental innovations, continual improvement programs, organic-growth initiatives. Disruptive innovations, breakthrough innovations, new-growth initiatives, white-space and blue-ocean strategies. But strategically speaking, all innovations fall into one of two buckets. In one are innovations that extend today’s business, either by enhancing existing offerings or by improving internal operations. In the other are innovations that generate new growth by reaching new customer segments or new markets, often through new business models.
The MVIS encompasses both types of innovation, but it’s critical that everyone involved in an MVIS (or any innovation program) understand the difference between the two buckets. The failure to do so causes many companies to either discount the importance of innovations that strengthen the ongoing business or to demand too much revenue from the new-growth initiatives too early. Agreeing on what to call the two buckets is a good starting place. For the purposes of this discussion we’ll call the first one “core innovations” and the second “new-growth innovations.”
Innovation projects meant to strengthen the core should be tied to the current strategy and managed mostly within the main business’s organizational structure. (The MVIS will keep track of them, though, as you’ll see later on.) They’re the projects expected to offer rapid and substantial returns in the near future and need to be funded at scale.
Conceivably, all your current innovation projects may be core. But what of the future? Will they be enough to enable you to reach your longer-term financial targets? If your company is typical, the answer is no. There will be a gap between your growth goals and what your current operations and core innovations can generate. It’s the purpose of the new-growth innovations to fill that gap.
New-growth initiatives push the frontier of your strategy by offering new or complementary products to existing customers, moving into adjacent product or geographic markets, or developing something utterly original, perhaps delivered in a completely novel way. The larger your company’s growth gap, the further from your core those innovation efforts will likely need to be, and the longer it will take to realize substantial revenue from them.
You can work up a serviceable estimate of the size of the gap if you spend up to two weeks developing rough but honest numbers for the revenue and profits your current operations will deliver in the next five years and then compare them with your five-year goals. This will give you a basic sense of what percentage of your time, effort, and resources needs to be focused on core innovation, and what percentage on new-growth efforts, and how ambitious the latter need to be.
When your growth gap is fairly large, you may wish to subdivide your new-growth efforts so that you can map them to different possible directions for future growth. This being a minimum innovation effort, we suggest designating no more than three such categories.
Manila Water is a public/private partnership in the Philippines that has done a good job of mapping its core and new-growth innovation efforts to its current and future goals. In 1997 it received a concession to provide water services to the eastern part of the city of Manila, covering about 6 million people. At the time only about 30% of the city’s households had reliable access to water. In the next 16 years the company made it available to almost every home in the area and approached international levels on key benchmarks such as pressure, purity, and turbidity.
The organization couldn’t have achieved such impressive performance without being highly innovative in the way it solved the challenges of operating within the chaotic environment of the Philippines. To improve the productivity of the core, it needed to keep pursuing those kinds of innovations—which it dubbed “core optimization.”
However, in 2013, CEO Gerardo Ablaza recognized that core optimization would not be enough to reach Manila Water’s long-term growth goals. The company’s calculations made it clear that over the next few years, 80% of its growth had to come from outside the core.
To fill such a large gap, Ablaza and his leadership team decided that the new-growth initiatives should fall into two broad categories: The first was adjacency moves, in which Manila Water would export its core business model to other geographic markets. The second was the pursuit of new kinds of offerings entirely, beyond the core mission of providing clean water.
That move presented Manila Water with a challenge: The more novel a category of innovation is, the more it will run counter to systems and processes designed to strengthen and support the current business. The next three pieces of the MVIS puzzle help companies overcome that difficulty.

Day 20 to 50: Zero In on a Few Strategic Opportunity Areas

Sophisticated innovators like Procter & Gamble, W.L. Gore, and Apple have elaborate processes to tie their various types of innovation to their short- and longer-term growth goals. The MVIS also does this, but in a simpler way. It makes efficient use of limited resources and productively channels innovators’ passions by focusing innovation efforts on a small number of strategic opportunity areas. These are areas that fit within your new-growth buckets and seem large enough to take the needed bite out of that growth gap.
How do you pick them? You could spend months or even years conducting a comprehensive analysis, but of course we don’t recommend that. Instead we suggest doing three weeks of research, with the aid of a handful of executives you expect will eventually be involved in your innovation efforts. Have them meet with at least a dozen customers, probing for unmet needs that could be the foundation of a new-growth innovation, and investigate new developments in and around your industry. Also, take a close look at new-growth efforts currently bubbling up inside your organization. These sometimes signal strategic objectives that aren’t yet getting proper attention from senior management. For example, when one financial services company examined the ideas emerging organically within its ranks, it saw that a number of them involved sophisticated analysis of customer data, even though it hadn’t yet announced that “big data” would be a strategic imperative. Competitive forces and customer demands had naturally begun to attract organizational energy.
Next, lock the members of the senior leadership team in a room for an afternoon, share the findings, and instruct them not to leave until they have identified three strategic opportunity areas that each combine the following:
  • A job that many potential customers need to do that no one is addressing very well.
  • Either a technology that will enable customers to do that job much more easily, cheaply, or conveniently, or a change in the economic, regulatory, or social landscape that is greatly intensifying the need for that job.
  • Some special capability of your company that competitors can’t easily copy that will give you an advantage in seizing this opportunity.
Manila Water used those criteria to identify a number of strategic opportunity areas, including treating wastewater generated by commercial enterprises. Manila Water selected this area because it recognized that a great many enterprises across the city produced wastewater. What’s more, increasing regulatory scrutiny meant that they could not continue to flush wastewater down the drain or casually dump it elsewhere, as they had been doing. As for a competitive advantage, Manila Water not only had substantial experience in treating wastewater but, as the enterprises’ water supplier, already knew these potential customers well, giving it a head start in developing the best solution for their needs.
If you take care to combine all three criteria, you can avoid some of the more common innovation traps, such as pursuing a phantom opportunity only because it seems so big that there must be money in it somewhere, or wandering into a new market where you have no natural advantage. Manila Water had initially considered, for instance, whether it might expand into advertising. After all, every month it was sending out millions of paper bills, on which someone might want to advertise, and the Filipino ad market was growing. But ultimately that area was deemed too far from the company’s existing capabilities to be reasonably defended against more-experienced competitors.
Identifying strategic opportunity areas will direct the energies of forward-thinking employees who might be playing with ideas at the fringes of your organization. It also helps highlight where people might be wasting their time. After all, its corollary is that it defines what you are not going to do. That’s something we’ll focus on in the next section.

Day 20 to 70: Form a Small, Dedicated Team to Develop the Innovations

Because you’re trying to set up a minimum innovation capability, you may think you could layer it into your existing organization by setting aside some time for everyone to innovate. But consider this: About 75% of venture-capital-backed start-ups fail to return one penny to their investors. Fewer than 50% of start-ups make it to their fourth birthday. These are businesses with dedicated teams whose members are pouring every ounce of their souls into succeeding. What hope does a group of part-timers have to beat the odds?
Even a minimum viable innovation system requires that at least one person (and typically more) get up every morning and go to sleep every night thinking about nothing but innovation. (That won’t be you, though it should be someone who reports to you. As the executive sponsor, you presumably have other responsibilities as well.)
But there’s no need to recruit an army. Manila Water created a three-person team to explore the first two strategic areas it identified. The team then developed a backup list of half a dozen extra opportunities that could be pursued if the first set didn’t pan out. We generally recommend starting in this focused way rather than setting up a large innovation function, which often creates work for itself to justify its existence. That said, we do recommend building the capacity to handle at least two ideas at once, since there inevitably will be course corrections and failure.
Two obstacles, in our experience, may daunt companies at this stage: a lack of resources and a lack of people with pertinent experience to staff the MVIS. Here’s how to overcome them:

Free up resources.

If you’re encountering the first problem, it’s time to bring your invisible innovation efforts out of the shadows. The odds are high that they include “zombie projects”—walking undead that shuffle along slowly but aren’t headed anywhere. Sometimes companies unwittingly spawn zombies by setting up redundant teams for core initiatives. Sometimes new-growth zombies lurk in an organization’s dark corners in unsanctioned efforts.
Finding the bulk of your zombies is a straightforward process: List all the innovation efforts that have the equivalent of at least one half-time employee working on them. Try to identify which market each idea targets. Estimate the size of the opportunity, and inventory the resources currently devoted to it. Which efforts enhance your core strategy and which focus on strategic opportunity areas? It should be fairly easy to identify the projects that are neither and are frittering away your resources.
In 2011, when Francesco Vanni d’Archirafi, then CEO of Citi Transaction Services, pushed his organization to track its innovation efforts, substantial duplication and fruitless efforts came to light. CTS streamlined its innovation portfolio by consolidating 75 mobile projects into 10, which liberated resources and increased strategic focus.
Identifying zombies is easier than killing them off, however. Many people find it hard to throw in the towel on a project that might somehow, someday work. And few people have the fortitude to admit that their project is essentially the same as someone else’s.
As a start, consider instituting “zombie amnesty,” whereby people can admit that their idea is too small, not strategic enough, or too riddled with difficult-to-address risks to justify further funding. Make it clear that there will be no penalty for purging a project. In fact, hold a celebration to honor those who do. They’re heroes and should be treated as such. One round of amnesty will probably release enough resources to get your innovation team up and running, although it’s a good idea to hold the exercise every couple of years to ensure that efforts haven’t wandered off course.

Learn by doing.

If your organization is just starting to focus on innovation, it’s unlikely that anyone you appoint to the team will have much experience with it. And yet we promised that you could get started in 90 days without hiring anyone. How?
Over the years, innovation thinkers and practitioners have offered up a wealth of best practices aimed at making new-growth innovation as orderly as the processes for manufacturing and marketing mature products. Companies like Intuit, Syngenta, and General Electric have elaborate systems to spread those practices throughout their organizations. In essence these systems combine some formal training with immersion in an actual product-development experience. A simpler version of this is an effective starting point for a neophyte MVIS team.
As experienced innovators, we use process checklists to make sure we haven’t left out any critical step. Those newer to innovation can do the same. Have your team devour the literature of best innovation practices and develop its own checklist, hang it on the wall, and refer to it frequently. (For some of our favorite books, see the sidebar “An Innovator’s Bookshelf.”) The team members will develop their skills as they work through problems, but the checklist will help ensure that they don’t go off the rails in the meantime.
A nonprofit, the Settlement Music School, used this approach to reach new student populations in inventive ways. Founded in 1908, SMS offered classes in jazz and classical music to 5,000 students—primarily children—weekly in the Philadelphia area. Executive director Helen Eaton hoped to transform SMS’s facilities into a “third place,” like a house of worship (or a neighborhood Starbucks), that could provide adults with a sense of community. After dividing her innovation ideas into core and new growth, she identified four strategic opportunity areas she called “best in class,” “community arts changes lives,” “innovation meets changing needs,” and “smart solutions for sustainability and growth.”
Led by community engagement manager Joseph Nebistinsky, a small team of innovators, which included several branch and department directors, began to conceive of new offerings in the “community arts changes lives” area, using our best-practices checklist. After two days of training, they went into the field to interview prospective customers about what offerings might enrich their lives. In his discussions, Germantown branch director Eric Anderson saw a recurrent theme: a desire for adults to reclaim their youth, meet new people, and dust off that guitar they’d stopped strumming in college. What if we created some way for adults to jam together in a band, he wondered? The team drafted a three-page brief outlining the idea, which ultimately became known as “Adult Rock Band.”
In an initiative so far from SMS’s core, many uncertainties needed to be resolved. How would the school attract students? What type of music should they play? One hook could be a culminating concert where the jam band would perform, but maybe the program should be more open-ended, with no big event?
Like seasoned innovators, the team laid out the assumptions underpinning a complete business model, which included how the program would be designed, marketed, and delivered. The idea was that a group of like-minded adults would come together and practice under the tutelage of an expert instructor. The class could continue indefinitely, separated into 10-week sessions; at the end of each session the band would hold a concert in the school’s performance space. As instructor Ed Wise told a local publication, “There’s something good for the soul about strapping on the old Fender and banging out a few Jack Bruce lines.”
Would that work? The members of the team had spent enough time with customers to be confident that Adult Rock Band addressed a real market need, and their back-of-the-envelope analysis showed that the program would break even if an individual branch could attract just eight participants. They set out to test the idea by running a pilot at a single branch and then expanding to two more.
The program did well at two branches but struggled at the third. Rather than walk away from the perceived failure, the school did a careful analysis. It showed that SMS needed to fine-tune the classes to the socioeconomic makeup of its local branches, taking into account each community’s musical traditions, cultural traditions, and social networks. As the school continued to innovate and look into why certain programs took hold in one community and not in another, the MVIS team found it could begin to predict the success rates of new offerings. Its success helped SMS earn a coveted grant from the Pew Charitable Trusts to support further investment in innovative programs.

Day 45 to 90: Create a Mechanism to Shepherd Projects

If you have robust planning and budgeting systems, by all means use them for your core innovation efforts. But new-growth innovations call for an approach that borrows from venture capital practices. Any entrepreneur who’s been backed by VCs will tell you that they operate within a system that’s just as disciplined as a traditional corporation’s annual budgeting cycle. But it’s a sharply different discipline, one designed to manage strategic uncertainty.
Begin by forming a group of senior leaders who, from then on, will have the autonomy to make decisions about starting, stopping, or redirecting new-growth innovation projects. Don’t just replicate the current executive committee, however. If you do, it will be too easy for group members to default to their corporate-planning mindset or to let day-to-day business creep into discussions about innovations meant to fulfill long-term goals. Manila Water, for instance, picked four members of its top management team to serve on what it called the New Services Review Committee, which met every few weeks to help teams working on new-growth ideas.
In overseeing projects, this group should copy some standard VC operating procedures:
  • Venture capital partners often disagree about investment opportunities. In fact, seasoned VCs will tell you that the best investments are the most polarizing. Every project in your MVIS should have a senior executive sponsor or champion who believes in it deeply, but you shouldn’t require approval from the entire shepherding group to go ahead.
  • A decision to invest in a start-up is considered very carefully, but most day-to-day spending decisions are left to the start-up’s CEO. Corporate innovation shepherds should set a threshold investment amount that project teams can spend themselves without asking for leadership approval.
  • Major VC funding doesn’t follow quarterly or annual budget cycles. When a start-up resolves a key risk, it gets further investment. (In Manila Water’s case, for instance, significant expansion capital was contingent on commercial clients’ signing water treatment contracts, rather than just saying they would.) And when a big issue arises, the board of a venture-backed company gathers within 36 hours. You should ensure that your shepherds are likewise capable of assembling and making decisions that quickly.
Venture capitalists, of course, don’t need to concern themselves with integrating their start-ups into a larger organization. Corporate shepherds, by contrast, are responsible for helping strengthen their whole organization’s innovation capabilities.
This is something that Mary Jo Haddad, who was the CEO of Toronto’s Hospital for Sick Children from 2004 to 2013, understood when she kicked off a major innovation effort there in 2010. Haddad created a shepherding mechanism: an 18-person cross-functional team called the Innovation Working Group, which was armed with $250,000 in funding. The IWG helps innovators understand the needs of users, test prototypes, make adjustments, and then build scale. It also works to identify latent organizational innovation talent by running workshops that gather ideas from staff, patients, families, and the public and gives employees with promising proposals the opportunity to step out of their day jobs for a while to push their ideas forward. Equally important, the IWG runs an annual Innovation Expo, which celebrates innovators who experiment with new ideas, regardless of whether they succeed or fail.
One area that absolutely cannot be shortchanged is personnel. If you have no one fully focused on new growth, you’ve decided not to focus on new growth.
While an MVIS approach avoids the arduous work of rewiring a company’s systems for performance management, budgeting, and supplier management, it has a downside: It requires senior leaders to get involved in those issues on an ad hoc basis. For instance, at one organization a high-performing employee was in danger of losing a promotion because the innovative business she was helping build didn’t cross a revenue threshold set by corporate HR’s advancement policies. But her responsibilities were at least equal to those of many others who did qualify for promotion, and there were clear signs that, managed appropriately, her business could deliver substantial long-term revenue. Her unit leader stepped in to preempt the HR decision.
You might not want to spend time mired in these types of discussions forever. So at some point you may wish to integrate an MVIS into the broader organization—the subject of the next section.

Scaling Up the MVIS

At the end of 90 days, you should have established your broad innovation buckets, identified your strategic opportunity areas, assembled a team that has started on its first project, and created the shepherding mechanism to speed the team on its way. Once you have the MVIS in place and see signs that specific projects will bear fruit (which may occur within the first few months or may take longer, depending on circumstances), it’s time to consider next steps.
First, consider hardwiring the components of the MVIS that are working well into more-formal systems. Manila Water created a master plan of innovation efforts, which forecast the pace and scale of its investment activities and their financial impact over a multiyear period. CTS assigned individuals to oversee certain processes and created tracking tools to enable them to regularly monitor the portfolio of innovation projects. Though such efforts can feel like creeping bureaucracy, they’re part of the natural maturation of innovation as an organizational capability.
Second, consider creating specialized functions to carry out parts of the innovation process. A small organization might, for example, assign a single person to act as a “scout,” keeping abreast of market changes. A large one might establish a business development team that looks for opportunities to form partnerships and alliances to amplify new-growth efforts. Or it might form groups to conduct ethnographic market research or develop rapid prototyping techniques.
Finally, work on the MVIS should highlight some of the larger barriers to innovation inside an organization. These often reside within corporate budgeting, incentive, and strategic-planning systems, which, after all, are designed to further today’s business, not create tomorrow’s. Rewiring those systems or establishing robust parallels presents substantial challenges but is critical to scaling up and spreading innovation efforts.
A division of a massive financial services company. A leading pediatric hospital. A water utility in an emerging market. A 100-year-old nonprofit. The organizations we’ve highlighted here are in different industries, have different missions, and operate in different contexts. But they share a problem faced by countless organizations around the globe: How do we start to make the magic of innovation more systematic and strategic? It is a daunting challenge. We conclude with three pieces of advice:
  • Remember, the “S” in MVIS stands for system. You can’t pick and choose between the four elements described above. Do everything, or do nothing.
  • One area that absolutely cannot be shortchanged is personnel. If you have no one fully focused on new growth, you’ve decided not to focus on new growth.
  • How you treat failure is more important than how you reward success. Hiding or fearing failure spawns projects that never die and that suck up all your capacity for innovation.

Creating an MVIS won’t miraculously turn you into Pixar or Amazon, but it will help you make tangible progress in increasing the predictability and productivity of critical investments in future growth.

Zombie projects and how to kill them

source

“You will never find them,” said a senior leader in a multibillion-dollar IT company.

The “them” the leader was referring to were zombie projects: the nefarious enemies of well-intentioned innovation efforts around the globe. Zombies are projects that, for any number of reasons, fail to fulfill their promise and yet keep shuffling along, sucking up resources without any real hope of having a meaningful impact on the company’s strategy or revenue prospects.

We had suggested that at least one reason why the company was struggling to successfully commercialize innovative ideas was that zombies were draining its resources and clogging its pipeline. The leader was skeptical.

He thought we wouldn’t find any given the company’s highly rigorous planning process. Every year scores of people spent months reviewing recent performance and sanity-checking future projections. Every project went under the proverbial microscope. So how could a zombie project possibly exist?

A zombie project spawns in predictable ways. The project certainly makes sense when first sanctioned by leadership. Its financial projections, while always uncertain, look reasonable. Market assumptions seem plausible. The development timeline looks achievable.

But somewhere along the line, something happens. The technology doesn’t quite work as planned. A competitor does something unanticipated. A key partner decides not to participate. Customers react in an unexpected way.

Project team members know that what’s happened isn’t good, but it’s hard for them to acknowledge when a project has come off the rails. Psychologists have pointed out how we suffer from confirmation bias, paying more attention to the things we expect and ignoring the things we don’t. And even when we’re aware of setbacks, we’re prone to using the affect heuristic — when we believe in something, we play up good news and ignore bad news.

At some point the data do become overwhelming, and if you gave team members truth serum, they’d admit that the project will never contribute meaningfully to the company’s financial and strategic goals. But since in most companies reward systems carry strong penalties for failing to meet commitments, people hesitate to raise their hands and say “Our project is one of those.” It just looks smarter to find ways to stay alive.

We had spent enough time with our IT company to know how skillfully project leaders could subvert the disciplines of the budgeting process to keep their zombies shuffling along. One recipe for survival: project big revenue numbers five years in the future but ask for only modest investment in the near term. In the next budget cycle, repeat the process so that projected revenue always stays safely beyond the planning process’s two-year horizon. As long as the team successfully manages its costs, everything’s fine, since there’s essentially no penalty for perpetually projecting, but never hitting, long-term targets.

Every budgeting system has its quirks, and innovators in survival mode will skillfully find and exploit them. To fight against these challenges we proposed a “zombie amnesty” — a period during which people can come clean, put their projects up for consideration, and suffer no repercussions if a project is terminated. The critical point of the amnesty is not to lay people off to cut costs but rather to allow the company to invest in new growth by redeploying them to more promising projects.
When we evaluated three dozen efforts for this IT company using realistic projections of possible revenues, we found 20% of them were zombies that didn’t warrant continued investment. Shutting those projects down without penalty would free up enough funding to support two years of more strategic innovation activities.

In a December 2014 HBR article, we argued that these kinds of zombie amnesties are a vital component of a systematic approach to innovation. But they aren’t easy to pull off. Based on our work and the work of like-minded academics — most notably Rita Gunther McGrath of Columbia University (a certified zombie killer if ever one existed) — we’ve identified six keys to doing it successfully:
  1. Use simple, transparent, predetermined criteria. Shutting a project down can be very emotional. Setting and sharing a shortlist of criteria before the process begins helps participants view the process as rational. At the most basic level, we always ask three questions about an idea: Is there a real market need? Can we fulfill that need better than current and potential competitors? Can we meet our financial objectives? Whatever the criteria, remember they are guidelines, not rules. Final decisions will always require some degree of subjective judgment.
  2. Involve outsiders. Parents will attest to how hard it is to be objective about something you’ve played a part in conceiving. An uninvolved outsider — someone from a different division or from the outside entirely — can bring important impartiality to the process.
  3. Codify lessons learned along the way. McGrath teaches that any time a company innovates, two good things can happen. The idea is successfully commercialized (clearly good), or — even if it is not — you learn something that sets you up for future success. Hold after-action reviews to capture lessons learned and create a living database to store and share those lessons. Research shows that “knowledge gained from failures [is] often instrumental in achieving subsequent successes,” so investing to capture and spread knowledge from your zombie projects maximizes the return on those investments.
  4. Expand the definition of success. Executives at large companies often fret about how to match the upside potential enjoyed by start-up entrepreneurs. They should spend far more time worrying about what happens to innovators that work on projects that don’t succeed commercially. After all, when taking well-thought-out risks carries the risk of punishment, it’s no surprise that people hesitate to take any risks. Any time you innovate, future success is unknown. Therefore, learning that an idea is not viable is a successful outcome, as long as those lessons are learned in a reasonably resource-efficient way. Pat team members on the back when they’ve given you that precious gift.
  5. Communicate widely. This might sound counterintuitive, but broadcasting commercial failures widely encourages future efforts, because innovation happens most naturally at companies that “dare to try.” That actually is the name of an award given by the Tata Group, India’s leading conglomerate. The award “recognises and rewards [the] most novel, daring, and seriously attempted ideas that did not achieve the desired results.” Shining a spotlight on these kinds of efforts naturally makes it safer for people to push the innovation boundaries. After all, if you don’t dare to try, how can you hope to succeed?
  6. Provide closure. This idea is ripped straight from McGrath’s excellent 2011 HBR article, “Failing by Design”: “Have a symbolic event—a wake, a play, a memorial—to give people closure.”
The Finnish mobile gaming company Supercell, which was valued at $3 billion only three years after its founding, demonstrates the power of following these disciplines. At Supercell, success is celebrated with beer, failure with champagne. Mistakes are addressed with brutal honesty, as when after a year of development and investment the company decided to scupper a multi-platform approach that fell short of its development targets. By decisively killing a potential zombie project and yet celebrating the good work of the team, Supercell allowed its members to shift their focus to a better idea. In this case, they went on to develop the massively successful Clash of Clans game.


Almost every company has more resources than it realizes. Find and put the zombies down, reallocate resources to your most promising projects, and you will suddenly find your innovation efforts getting better and bigger faster.

Best Python Package layout

source

Packaging a python library

Sun 25 May 2014

Note
This is about packaging libraries, not applications.

All the advice here is implemented in a project template (with full support for C extensions): cookiecutter-pylibrary (introduction).
I think the packaging best practices should be revisited; there are lots of good tools nowadays that are either unused or underused. It's generally a good thing to re-evaluate best practices from time to time.
I assume here that your package is to be tested on multiple Python versions, with different combinations of dependency versions, settings etc.
And a few principles that I like to follow when packaging:
  • If there's a tool that can help with testing use it. Don't waste time building a custom test runner if you can just use py.test or nose. They come with a large ecosystem of plugins that can improve your testing.
  • When possible, prevent issues early. This is mostly a matter of strictness and exhaustive testing. Design things to prevent common mistakes.
  • Collect all the coverage data. Record it. Identify regressions.
  • Test all the possible configurations.

The structure

This is fairly important; everything revolves around it. I prefer this sort of layout:
├─ src
│  └─ packagename
│     ├─ __init__.py
│     └─ ...
├─ tests
│  └─ ...
└─ setup.py
The src directory is a better approach because:
  • You get import parity. The current directory is implicitly included in sys.path; but not so when installing & importing from site-packages. Users will never have the same current working directory as you do.
    This constraint has beneficial implications in both testing and packaging:
    • You will be forced to test the installed code (e.g.: by installing in a virtualenv). This will ensure that the deployed code works (it's packaged correctly) - otherwise your tests will fail. Early. Before you can publish a broken distribution.
    • You will be forced to install the distribution. If you ever uploaded a distribution on PyPI with missing modules or broken dependencies, it's because you didn't test the installation. Just being able to successfully build the sdist doesn't guarantee it will actually install!
  • It prevents you from readily importing your code in the setup.py script. This is a bad practice because it will always blow up if importing the main package or module triggers additional imports for dependencies (which may not be available [5]). Best to not make it possible in the first place.
  • Simpler packaging code and manifest. It makes manifests very simple to write (e.g.: you package a Django app that has templates or static files). Also, zero fuss for large libraries that have multiple packages. Clear separation of code being packaged and code doing the packaging.
    Without src, writing a MANIFEST.in is tricky [6]. If your manifest is broken your tests will fail. It's much easier with a src directory: just add graft src in MANIFEST.in (see the sketch just after this list).
    Publishing a broken package to PyPI is not fun.
  • Without src you get messy editable installs ("setup.py develop" or "pip install -e"). Having no separation (no src dir) will force setuptools to put your project's root on sys.path - with all the junk in it (e.g.: setup.py and other test or configuration scripts will unwittingly become importable).
  • There are better tools. You don't need to deal with installing packages just to run the tests anymore. Just use tox - it will install the package for you [2] automatically, zero fuss, zero friction.
  • Less chance for user mistakes - they will happen - assume nothing!
  • Less chance for tools to mix up code with non-code.
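For reference, a minimal MANIFEST.in under this layout might look like this (the exact file list is illustrative, not from the original post):

graft src
graft tests

include setup.py
include README.rst CHANGELOG.rst
global-exclude *.py[cod]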
Another way to put it: flat is better than nested [*] - but not for data. A file system is just data after all - and cohesive, well-normalized data structures are desirable.
You'll notice that I don't include the tests in the installed packages. Because:
  • Module discovery tools will trip over your test modules. Strange things usually happen in test modules. The help builtin does module discovery. E.g.:
    >>> help('modules')
    Please wait a moment while I gather a list of all available modules...
    
    __future__          antigravity         html                select
    ...
    
  • Tests usually require additional dependencies to run, so they aren't usable on their own - you can't run them directly.
  • Tests are concerned with development, not usage.
  • It's extremely unlikely that the user of the library will run the tests instead of the library's developer. E.g.: you don't run the tests for Django while testing your apps - Django is already tested.

Alternatives

You could use src-less layouts; here are a few examples:
Tests in package:

├─ packagename
│  ├─ __init__.py
│  ├─ ...
│  └─ tests
│     └─ ...
└─ setup.py

Tests outside package:

├─ packagename
│  ├─ __init__.py
│  └─ ...
├─ tests
│  └─ ...
└─ setup.py
These two layouts became popular because packaging had many problems a few years ago, so it wasn't feasible to install the package just to test it. People still recommend them [4], even though that advice is based on old and outdated assumptions.
Most projects use them incorrectly, as all the test runners except Twisted's trial have incorrect defaults for the current working directory - you're going to test the wrong code if you don't test the installed code. trial does the right thing by changing the working directory to something temporary, but most projects don't use trial.

The setup script

Unfortunately with the current packaging tools, there are many pitfalls. The setup.py script should be as simple as possible:
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
from __future__ import absolute_import
from __future__ import print_function

import io
import re
from glob import glob
from os.path import basename
from os.path import dirname
from os.path import join
from os.path import splitext

from setuptools import find_packages
from setuptools import setup


def read(*names, **kwargs):
    return io.open(
        join(dirname(__file__), *names),
        encoding=kwargs.get('encoding', 'utf8')
    ).read()


setup(
    name='nameless',
    version='0.1.0',
    license='BSD 2-Clause License',
    description='An example package. Generated with https://github.com/ionelmc/cookiecutter-pylibrary',
    long_description='%s\n%s' % (
        re.compile('^.. start-badges.*^.. end-badges', re.M | re.S).sub('', read('README.rst')),
        re.sub(':[a-z]+:`~?(.*?)`', r'``\1``', read('CHANGELOG.rst'))
    ),
    author='Ionel Cristian Mărieș',
    author_email='contact@ionelmc.ro',
    url='https://github.com/ionelmc/python-nameless',
    packages=find_packages('src'),
    package_dir={'': 'src'},
    py_modules=[splitext(basename(path))[0] for path in glob('src/*.py')],
    include_package_data=True,
    zip_safe=False,
    classifiers=[
        # complete classifier list: http://pypi.python.org/pypi?%3Aaction=list_classifiers
        'Development Status :: 5 - Production/Stable',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: BSD License',
        'Operating System :: Unix',
        'Operating System :: POSIX',
        'Operating System :: Microsoft :: Windows',
        'Programming Language :: Python',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.3',
        'Programming Language :: Python :: 3.4',
        'Programming Language :: Python :: 3.5',
        'Programming Language :: Python :: 3.6',
        'Programming Language :: Python :: Implementation :: CPython',
        'Programming Language :: Python :: Implementation :: PyPy',
        # uncomment if you test on these interpreters:
        # 'Programming Language :: Python :: Implementation :: IronPython',
        # 'Programming Language :: Python :: Implementation :: Jython',
        # 'Programming Language :: Python :: Implementation :: Stackless',
        'Topic :: Utilities',
    ],
    keywords=[
        # eg: 'keyword1', 'keyword2', 'keyword3',
    ],
    install_requires=[
        'click',
        # eg: 'aspectlib==1.1.1', 'six>=1.7',
    ],
    extras_require={
        # eg:
        #   'rst': ['docutils>=0.11'],
        #   ':python_version=="2.6"': ['argparse'],
    },
    entry_points={
        'console_scripts': [
            'nameless = nameless.cli:main',
        ]
    },
)
What's special about this:
  • No exec or import trickery.
  • Includes everything from src: packages or root-level modules.
  • Explicit encodings.

Running the tests

Again, it seems people fancy the idea of running python setup.py test to run the package's tests. I think that's not worth doing: setup.py test is a failed experiment to replicate some of CPAN's test system. Python doesn't have a common test result protocol, so a common test command serves no purpose [1]. At least not for now - we'd need someone to build specifications and services that make this worthwhile, and to champion them. I think it's important in general to recognize failure where it exists and go back to the drawing board when necessary - there are absolutely no services or tools that use the setup.py test command in a way that brings added value. Something is definitely wrong here.
I believe it's too late now for PyPI to do anything about it; Travis is already a solid, reliable, extremely flexible and free alternative. It integrates very well with GitHub - builds will be run automatically for each pull request.
To test locally tox is a very good way to run all the possible testing configurations (each configuration will be a tox environment). I like to organize the tests into a matrix with these additional environments:
  • check - check package metadata (e.g.: if the restructured text in your long description is valid)
  • clean - clean coverage
  • report - make coverage report for all the accumulated data
  • docs - build sphinx docs
I also like to have environments with and without coverage measurement, and to run them all the time. Race conditions are usually performance-sensitive, and you're unlikely to catch them if you run everything with coverage measurement.

The test matrix

Depending on your dependencies, you'll usually end up with a huge number of combinations of Python versions, dependency versions and different settings. Generally people just hard-code everything in tox.ini, or only in .travis.yml. They end up with incomplete local tests, or test configurations that run serially on Travis. I've tried that; didn't like it. I've tried duplicating the environments in both tox.ini and .travis.yml. Still didn't like it.
Note
This bootstrap.py technique is a bit outdated now. It still works fine but for simple matrices you can use a tox generative envlist (it was implemented after I wrote this blog post, unfortunately).

See python-nameless for an example using that.
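For a simple matrix, a generative envlist with factor-conditional settings looks something like this (a sketch of the tox feature mentioned above, not taken from the original post):

[tox]
envlist = {py27,py33,py34,py35,py36,pypy}-{cover,nocov}

[testenv]
deps =
    pytest
    cover: pytest-cov
commands =
    cover: py.test --cov=src {posargs}
    nocov: py.test {posargs}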
As there were no readily usable alternatives for generating the configuration, I've implemented a generator script that uses templates to generate tox.ini and .travis.yml. This is way better: it's DRY, you can easily skip running tests on specific configurations (e.g.: skip Django 1.4 on Python 3), and there's less work when changing things.
The essentials (full code):

setup.cfg

The generator script uses a configuration file (setup.cfg for convenience):
[isort]
not_skip = __init__.py
skip = migrations

[matrix]
# This is the configuration for the `./bootstrap.py` script.
# It generates `.travis.yml`, `tox.ini` and `appveyor.yml`.
#
# Syntax: [alias:] value [!variable[glob]] [&variable[glob]]
#
# alias:
#  - is used to generate the tox environment
#  - it's optional
#  - if not present the alias will be computed from the `value`
# value:
#  - a value of "-" means empty
# !variable[glob]:
#  - exclude the combination of the current `value` with
#    any value matching the `glob` in `variable`
#  - can use as many you want
# &variable[glob]:
#  - only include the combination of the current `value`
#    when there's a value matching `glob` in `variable`
#  - can use as many you want

python_versions =
    2.7
    3.3
    3.4
    3.5
    3.6
    pypy

dependencies =
#    1.4: Django==1.4.16 !python_versions[3.*]
#    1.5: Django==1.5.11
#    1.6: Django==1.6.8
#    1.7: Django==1.7.1 !python_versions[2.6]
# Deps commented above are provided as examples. That's what you would use in a Django project.

coverage_flags =
    cover: true
    nocov: false

environment_variables =
    -

ci/bootstrap.py

This is the generator script. You run this whenever you want to regenerate the configuration:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import absolute_import, print_function, unicode_literals

import os
import sys
from os.path import abspath
from os.path import dirname
from os.path import exists
from os.path import join


if __name__ == "__main__":
    base_path = dirname(dirname(abspath(__file__)))
    print("Project path: {0}".format(base_path))
    env_path = join(base_path, ".tox", "bootstrap")
    if sys.platform == "win32":
        bin_path = join(env_path, "Scripts")
    else:
        bin_path = join(env_path, "bin")
    if not exists(env_path):
        import subprocess

        print("Making bootstrap env in: {0} ...".format(env_path))
        try:
            subprocess.check_call(["virtualenv", env_path])
        except subprocess.CalledProcessError:
            subprocess.check_call([sys.executable, "-m", "virtualenv", env_path])
        print("Installing `jinja2` and `matrix` into bootstrap environment...")
        subprocess.check_call([join(bin_path, "pip"), "install", "jinja2", "matrix"])
    activate = join(bin_path, "activate_this.py")
    # noinspection PyCompatibility
    exec(compile(open(activate, "rb").read(), activate, "exec"), dict(__file__=activate))

    import jinja2

    import matrix

    jinja = jinja2.Environment(
        loader=jinja2.FileSystemLoader(join(base_path, "ci", "templates")),
        trim_blocks=True,
        lstrip_blocks=True,
        keep_trailing_newline=True
    )

    tox_environments = {}
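    # Expand the [matrix] section of setup.cfg into one entry per
    # combination; each alias (e.g. "2.7-cover") carries the values of
    # all the matrix variables for that combination.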
    for (alias, conf) in matrix.from_file(join(base_path, "setup.cfg")).items():
        python = conf["python_versions"]
        deps = conf["dependencies"]
        tox_environments[alias] = {
            "python": "python" + python if "py" not in python else python,
            "deps": deps.split(),
        }
        if "coverage_flags" in conf:
            cover = {"false": False, "true": True}[conf["coverage_flags"].lower()]
            tox_environments[alias].update(cover=cover)
        if "environment_variables" in conf:
            env_vars = conf["environment_variables"]
            tox_environments[alias].update(env_vars=env_vars.split())

    # Render every template in ci/templates into the project root.
    for name in os.listdir(join(base_path, "ci", "templates")):
        with open(join(base_path, name), "w") as fh:
            fh.write(jinja.get_template(name).render(tox_environments=tox_environments))
        print("Wrote {}".format(name))
    print("DONE.")

ci/templates/.travis.yml

This one has some goodies in it, like the very useful libSegFault.so trick: LD_PRELOAD-ing glibc's libSegFault.so makes a crashing process print a backtrace, and SEGFAULT_SIGNALS=all extends that to other fatal signals. Beyond that it basically just runs tox.
language: python
sudo: false
cache: pip
env:
  global:
    - LD_PRELOAD=/lib/x86_64-linux-gnu/libSegFault.so
    - SEGFAULT_SIGNALS=all
  matrix:
    - TOXENV=check
    - TOXENV=docs
matrix:
  include:
{%- for env, config in tox_environments|dictsort %}{{ '' }}
    - python: '{{ '{0[0]}-5.4'.format(env.split('-')) if env.startswith('pypy') else env.split('-')[0] }}'
      env:
        - TOXENV={{ env }}{% if config.cover %},report,coveralls,codecov{% endif -%}
{% endfor %}

before_install:
  - python --version
  - uname -a
  - lsb_release -a
install:
  - pip install tox
  - virtualenv --version
  - easy_install --version
  - pip --version
  - tox --version
script:
  - tox -v
after_failure:
  - more .tox/log/* | cat
  - more .tox/*/log/* | cat
notifications:
  email:
    on_success: never
    on_failure: always
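To make the template concrete: with the matrix from setup.cfg above, the rendered .travis.yml would contain one entry per environment, roughly like this sketch for 2.7-cover (not verbatim output):
    - python: '2.7'
      env:
        - TOXENV=2.7-cover,report,coveralls,codecov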

ci/templates/tox.ini

[tox]
envlist =
    clean,
    check,
{% for env in tox_environments|sort %}
    {{ env }},
{% endfor %}
    report,
    docs

[testenv]
basepython =
    {docs,spell}: {env:TOXPYTHON:python2.7}
    {bootstrap,clean,check,report,extension-coveralls,coveralls,codecov}: {env:TOXPYTHON:python3}
setenv =
    PYTHONPATH={toxinidir}/tests
    PYTHONUNBUFFERED=yes
passenv =
    *
deps =
    pytest
    pytest-travis-fold
commands =
    {posargs:py.test -vv --ignore=src}

[testenv:spell]
setenv =
    SPELLCHECK=1
commands =
    sphinx-build -b spelling docs dist/docs
skip_install = true
usedevelop = false
deps =
    -r{toxinidir}/docs/requirements.txt
    sphinxcontrib-spelling
    pyenchant

[testenv:docs]
deps =
    -r{toxinidir}/docs/requirements.txt
commands =
    sphinx-build {posargs:-E} -b html docs dist/docs
    sphinx-build -b linkcheck docs dist/docs

[testenv:bootstrap]
deps =
    jinja2
    matrix
skip_install = true
usedevelop = false
commands =
    python ci/bootstrap.py

[testenv:check]
deps =
    docutils
    check-manifest
    flake8
    readme-renderer
    pygments
    isort
skip_install = true
usedevelop = false
commands =
    python setup.py check --strict --metadata --restructuredtext
    check-manifest {toxinidir}
    flake8 src tests setup.py
    isort --verbose --check-only --diff --recursive src tests setup.py

[testenv:coveralls]
deps =
    coveralls
skip_install = true
usedevelop = false
commands =
    coveralls []

[testenv:codecov]
deps =
    codecov
skip_install = true
usedevelop = false
commands =
    coverage xml --ignore-errors
    codecov []


[testenv:report]
deps = coverage
skip_install = true
usedevelop = false
commands =
    coverage combine --append
    coverage report
    coverage html

[testenv:clean]
commands = coverage erase
skip_install = true
usedevelop = false
deps = coverage

{% for env, config in tox_environments|dictsort %}
[testenv:{{ env }}]
basepython = {env:TOXPYTHON:{{ config.python }}}
{% if config.cover or config.env_vars %}
setenv =
    {[testenv]setenv}
{% endif %}
{% for var in config.env_vars %}
    {{ var }}
{% endfor %}
{% if config.cover %}
usedevelop = true
commands =
    {posargs:py.test --cov --cov-report=term-missing -vv}
{% endif %}
{% if config.cover or config.deps %}
deps =
    {[testenv]deps}
{% endif %}
{% if config.cover %}
    pytest-cov
{% endif %}
{% for dep in config.deps %}
    {{ dep }}
{% endfor %}

{% endfor %}
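Rendered for the 2.7-cover environment, that last loop would produce something along these lines (a sketch, not verbatim output):
[testenv:2.7-cover]
basepython = {env:TOXPYTHON:python2.7}
setenv =
    {[testenv]setenv}
usedevelop = true
commands =
    {posargs:py.test --cov --cov-report=term-missing -vv}
deps =
    {[testenv]deps}
    pytest-cov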

ci/templates/appveyor.yml

For Windows-friendly projects:
version: '{branch}-{build}'
build: off
cache:
  - '%LOCALAPPDATA%\pip\Cache'
environment:
  global:
    WITH_COMPILER: 'cmd /E:ON /V:ON /C .\ci\appveyor-with-compiler.cmd'
  matrix:
    - TOXENV: check
      TOXPYTHON: C:\Python27\python.exe
      PYTHON_HOME: C:\Python27
      PYTHON_VERSION: '2.7'
      PYTHON_ARCH: '32'
{% for env, config in tox_environments|dictsort %}{{ '' }}{% if config.python.startswith('python') %}
    - TOXENV: '{{ env }}{% if config.cover %},report,codecov{% endif %}'
      TOXPYTHON: C:\{{ config.python.replace('.', '').capitalize() }}\python.exe
      PYTHON_HOME: C:\{{ config.python.replace('.', '').capitalize() }}
      PYTHON_VERSION: '{{ config.python[-3:] }}'
      PYTHON_ARCH: '32'
    - TOXENV: '{{ env }}{% if config.cover %},report,codecov{% endif %}'
      TOXPYTHON: C:\{{ config.python.replace('.', '').capitalize() }}-x64\python.exe
      {%- if config.python != 'python3.5' %}

      WINDOWS_SDK_VERSION: v7.{{ '1' if config.python[-3] == '3' else '0' }}
      {%- endif %}

      PYTHON_HOME: C:\{{ config.python.replace('.', '').capitalize() }}-x64
      PYTHON_VERSION: '{{ config.python[-3:] }}'
      PYTHON_ARCH: '64'

{% endif %}{% endfor %}
init:
  - ps: echo $env:TOXENV
  - ps: ls C:\Python*
install:
  - python -u ci\appveyor-bootstrap.py
  - '%PYTHON_HOME%\Scripts\virtualenv --version'
  - '%PYTHON_HOME%\Scripts\easy_install --version'
  - '%PYTHON_HOME%\Scripts\pip --version'
  - '%PYTHON_HOME%\Scripts\tox --version'
test_script:
  - '%WITH_COMPILER% %PYTHON_HOME%\Scripts\tox'

on_failure:
  - ps: dir "env:"
  - ps: get-content .tox\*\log\*
artifacts:
  - path: dist\*

### To enable remote debugging uncomment this (also, see: http://www.appveyor.com/docs/how-to/rdp-to-build-worker):
# on_finish:
#   - ps: $blockRdp = $true; iex ((new-object net.webclient).DownloadString('https://raw.githubusercontent.com/appveyor/ci/master/scripts/enable-rdp.ps1'))
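Again, to make it concrete: for the 2.7-cover environment the rendered matrix would contain entries roughly like this (a sketch of the 32-bit one):
    - TOXENV: '2.7-cover,report,codecov'
      TOXPYTHON: C:\Python27\python.exe
      PYTHON_HOME: C:\Python27
      PYTHON_VERSION: '2.7'
      PYTHON_ARCH: '32'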
If you've been patient enough to read through that you'll notice:
  • The Travis configuration uses tox for each item in the matrix. This makes testing in Travis consistent with testing locally.
  • The environment order for tox is clean, check, then all the generated environments (e.g.: 2.7-cover, 2.7-nocov, ...), then report and docs.
  • The environments with coverage measurement run the code from a develop install (usedevelop = true) instead of a built package, so that coverage can combine all the measurements at the end.
  • The environments without coverage will sdist and install into virtualenv (tox's default behavior [2]) so that packaging issues are caught early.
  • The report environment combines all the runs at the end into a single report.
Having the complete list of environments in tox.ini is a huge advantage:
  • You can run everything in parallel locally (if your tests don't need strict isolation) with detox. And you can still run everything in parallel if you want to use drone.io instead of Travis.
  • You can measure cumulative coverage locally (merging the coverage measurements from all the environments into a single report), as sketched below.
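One possible local sequence, assuming detox is installed (you may want to restrict detox to just the test environments with -e, so clean and report don't run concurrently with the tests):
tox -e clean                  # erase stale coverage data first
detox -e 2.7-cover,3.6-cover  # run environments in parallel
tox -e report                 # combine the runs into a single report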

Test coverage

Coveralls is a nice way to track coverage over time and across multiple builds. It automatically adds comments on GitHub Pull Requests about changes in coverage.

TL;DR

  • Put code in src.
  • Use tox and detox.
  • Test both with coverage measurements and without.
  • Use a generator script for tox.ini and .travis.yml.
  • Run the tests in Travis with tox to keep things consistent with local testing.
Too complicated? Just use a python package template.
Not convincing enough? Read Hynek's post about the src layout.