Wednesday, April 27, 2016

How To Sound Smart At Your Next Team Meeting by Matthew Jones

source

Occam's Razor

This widely known adage is attributed to William of Ockham, a fourteenth-century philosopher and friar. Occam's Razor is often stated as:
"Among competing hypotheses, the one with the fewest assumptions should be selected."
It's no surprise that the whole reason we can recall an adage from 600+ years ago is that it works so well. Occam's Razor is so basic, so fundamental, that it should be the first thing we think of when deciding between two competing theories. I'd even go so far as to argue that in the vast majority of cases, simpler is better.

Hanlon's Razor

Sometimes I feel like users are intentionally trying to piss me off. They push buttons they weren't supposed to, find flaws that shouldn't have been visible to them (since they weren't visible to me), and generally make big swaths of my life more difficult than it would otherwise be.
I try to remember, though, that the vast majority of actions done by people which may seem malicious are not intentionally so. Rather, it's because they don't know any better. This is the crux of an adage known as Hanlon's Razor, which states:
"Never attribute to malice what can be adequately explained by stupidity."
Don't assume people are malicious; assume they are ignorant, and then help them overcome that ignorance. Most people want to learn, not be mean for the fun of it.

The Pareto Principle

The last Basic Law of Software Development is the Pareto Principle. Romanian-American engineer Joseph M Juran formulated this adage, which he named after an idea proposed by Italian economist and thinker Vilfredo Pareto. The Pareto Principle is usually worded as:
"80% of the effects stem from 20% of the causes."
Have you ever been in a situation where your app has hundreds of errors, but when you track down just one of the problems, a disproportionate number of those errors up and vanish? If you have (and you probably have), then you've experienced the Pareto Principle in action. Many of the problems we see, whether coding, dealing with customers, or just living our lives, share a small set of common root issues that, if solved or alleviated, can make most or all of the visible problems disappear.
In short, the fastest way to solve many problems at once is to find and fix their common root cause.
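To make that concrete, here is a minimal sketch (my own illustration with an invented error log, not from the original article) of what hunting for the common root cause can look like: tally the errors you're seeing by their traced root cause and attack the biggest bucket first.

from collections import Counter

# Hypothetical error log: each entry has already been traced back to a root cause.
error_log = [
    "null config value", "timeout", "null config value", "null config value",
    "bad encoding", "null config value", "timeout", "null config value",
    "null config value", "null config value",
]

counts = Counter(error_log)
total = sum(counts.values())
for cause, n in counts.most_common():
    print(f"{cause:20s} {n:3d} ({n / total:.0%} of all errors)")

# One root cause ("null config value") accounts for 70% of the entries here;
# fix it and most of the noise disappears at once.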

Dunning-Kruger Effect

Researchers David Dunning and Justin Kruger, conducting an experiment in 1999, observed a phenomenon that's come to be known as the Dunning-Kruger effect:
"Unskilled persons tend to mistakenly assess their own abilities as being much more competent than they actually are."
What follows from this is a bias in which people who aren't very good at their job think they are good at it, but aren't skilled enough to recognize that they aren't. Of all the laws in this list, the Dunning-Kruger effect may be the most powerful, if for no other reason than it has been actively investigated in a formal setting by a real-life research team.

Linus's Law

Author and developer Eric S. Raymond formulated this law, which he named after Linus Torvalds. Linus's Law states:
"Given enough eyeballs, all bugs are shallow."
In other words, if you can't find the problem, get someone else to help. This is why concepts like pair programming work well in certain contexts; after all, more often than not, the bug is in your code.

Robustness Principle (AKA Postel's Law)

One of the fundamental ideas in software development, particularly fields such as API design, can be concisely expressed by the Robustness Principle:
"Be conservative in what you do, be liberal in what you accept from others."
This principle is also called Postel's Law, after Jon Postel, the Internet pioneer who originally wrote it down as part of RFC 760. It's worth remembering, if for no other reason than as a gentle reminder that often the best code is no code at all.
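As a toy illustration (mine, not from the original article), consider a hypothetical function at an API boundary that accepts dates from callers. Being liberal in what you accept might mean tolerating a few common spellings of a date; being conservative in what you do means always emitting a single canonical form.

from datetime import datetime

# Hypothetical: the formats this endpoint is willing to accept from callers.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")

def normalize_date(text):
    """Liberal in what it accepts, conservative (ISO 8601 only) in what it emits."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")

print(normalize_date(" 27/04/2016 "))    # -> '2016-04-27'
print(normalize_date("April 27, 2016"))  # -> '2016-04-27'

The point is that sloppy input is tolerated at the edge, but nothing sloppy ever propagates outward.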

Eagleson's Law

Ever been away from a project for a long time, then returned to it and wondered "what idiot wrote this crap?" only to find out that the idiot was you?
Eagleson's Law describes this situation quite accurately:
"Any code of your own that you haven't looked at for six or more months might as well have been written by someone else."
Remember that the next time you're rejoining a project you've been away from for months. The code is no longer your code; it is someone else's that you've now been tasked with improving.

Peter Principle

One of the fundamental laws that can apply to managers (of any field, not just software) is the Peter Principle, formulated by Canadian educator Laurence J Peter:
"The selection of a candidate for a position is based on the candidate's performance in their current role, rather than on abilities relevant to the intended role."
The Peter Principle is often sarcastically reduced to "Managers rise to their level of incompetence." The idea of this principle looks like this:
(Chart: a candidate advances to higher and higher levels of management until reaching a position they are no longer qualified to hold on skill alone.)
The problem revealed by the Peter Principle is that workers tend to get evaluated on how well they are currently doing, and their superiors assume that those workers would also be good at a different role, even though their current role and their intended role may not be the same or even similar. Eventually, such promotions place unqualified candidates in high positions of power, and in particularly bad cases you can end up with pointy-haired bosses at every step of an organization's hierarchy.

Dilbert Principle

Speaking of pointy-haired bosses, cartoonist Scott Adams (who publishes the comic strip Dilbert) proposed a negative variation of the Peter Principle which he named the Dilbert Principle. The Peter Principle assumes that the promoted workers are in fact competent at their current position; this is why they got promoted in the first place. By contrast, the Dilbert Principle assumes that the least competent people get promoted the fastest. The Dilbert Principle is usually stated like this:
"Incompetent workers will be promoted above competent workers to managerial positions, thus removing them from the actual work and minimizing the damage they can do."
This can be phrased another way: "Companies are hesitant to fire people but also want to not let them hurt their business, so companies promote incompetent workers into the place where they can do the least harm: management."

Hofstadter's Law

Ever noticed that doing something always takes longer than you think? So did Douglas Hofstadter, who wrote a seminal book on cognitive science and self-reference called Gödel, Escher, Bach: An Eternal Golden Braid. In that book, he proposed Hofstadter's Law:
"It always takes longer than you expect, even when you take into account Hofstadter's Law."
Always is the key word: nothing ever goes as planned, so you're better off building extra time into your estimates to cover whatever goes wrong, because something unfailingly does.

The 90-90 Rule

Because something always goes wrong, and because people are notoriously bad at estimating their own skill level, Tom Cargill, an engineer at Bell Labs in the 1980s, proposed something that eventually came to be called the 90-90 rule:
"The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time."
Perhaps this explains why so many software projects end up over budget and short on features.

Parkinson's Law

Possibly the most astute observation that can be applied to the art of estimation comes from British naval historian C. N. Parkinson. He jokingly proposed an adage that came to be known as Parkinson's Law, originally stated as:
"Work expands so as to fill the time available for its completion."
Remember this next time you pad your estimates.

Sayre's Law

Economist and professor Charles Issawi proposed an idea that came to be known as Sayre's Law, named after a fellow professor at Columbia University. Issawi's formulation of this law looks like this:
"In any dispute the intensity of feeling is inversely proportional to the value of the issues at stake."
In short, the less significant something is, the more passionately people will argue about it.

Parkinson's Law of Triviality (AKA Bikeshedding)

Sayre's Law segues directly into another law that applies to meetings, and here we again encounter the ideas of C.N. Parkinson. Parkinson's Law of Triviality states:
"The time spent on any agenda item will be in inverse proportion to the sum of money involved."
Parkinson imagined a situation in which a committee is tasked with designing a nuclear reactor. Said committee then spends a disproportionate amount of time designing the reactor's bikeshed, since any common person has enough life experience to understand what a bikeshed should look like. Clearly the "core" functions of the reactor are more important, but they are so complex that no average person will understand all of them intimately. Consequently, time (and opinion) is spent on ideas that everyone can comprehend, but which are clearly more trivial.

Law of Argumentative Comprehension

The last law is one I totally made up; I use it as shorthand for both Sayre's Law and Parkinson's Law of Triviality. I call it the Law of Argumentative Comprehension:

"The more people understand something, the more willing they are to argue about it, and the more vigorously they will do so."

Thursday, April 21, 2016

Design path via testing

http://www.satisfice.com/blog/archives/856
https://vimeo.com/80533536
http://pyvideo.org/video/1670/boundaries

Monday, April 18, 2016

How People Learn to Become Resilient By Maria Konnikova

source

Norman Garmezy, a developmental psychologist and clinician at the University of Minnesota, met thousands of children in his four decades of research. But one boy in particular stuck with him. He was nine years old, with an alcoholic mother and an absent father. Each day, he would arrive at school with the exact same sandwich: two slices of bread with nothing in between. At home, there was no other food available, and no one to make any. Even so, Garmezy would later recall, the boy wanted to make sure that “no one would feel pity for him and no one would know the ineptitude of his mother.” Each day, without fail, he would walk in with a smile on his face and a “bread sandwich” tucked into his bag.
The boy with the bread sandwich was part of a special group of children. He belonged to a cohort of kids—the first of many—whom Garmezy would go on to identify as succeeding, even excelling, despite incredibly difficult circumstances. These were the children who exhibited a trait Garmezy would later identify as “resilience.” (He is widely credited with being the first to study the concept in an experimental setting.) Over many years, Garmezy would visit schools across the country, focussing on those in economically depressed areas, and follow a standard protocol. He would set up meetings with the principal, along with a school social worker or nurse, and pose the same question: Were there any children whose backgrounds had initially raised red flags—kids who seemed likely to become problem kids—who had instead become, surprisingly, a source of pride? “What I was saying was, ‘Can you identify stressed children who are making it here in your school?’ “ Garmezy said, in a 1999 interview. “There would be a long pause after my inquiry before the answer came. If I had said, ‘Do you have kids in this school who seem to be troubled?,’ there wouldn’t have been a moment’s delay. But to be asked about children who were adaptive and good citizens in the school and making it even though they had come out of very disturbed backgrounds—that was a new sort of inquiry. That’s the way we began.”
Environmental threats can come in various guises. Some are the result of low socioeconomic status and challenging home conditions. (Those are the threats studied in Garmezy’s work.) Often, such threats (parents with psychological or other problems; exposure to violence or poor treatment; being a child of problematic divorce) are chronic. Other threats are acute: experiencing or witnessing a traumatic violent encounter, for example, or being in an accident. What matters is the intensity and the duration of the stressor. In the case of acute stressors, the intensity is usually high. The stress resulting from chronic adversity, Garmezy wrote, might be lower—but it “exerts repeated and cumulative impact on resources and adaptation and persists for many months and typically considerably longer.”
Prior to Garmezy’s work on resilience, most research on trauma and negative life events had a reverse focus. Instead of looking at areas of strength, it looked at areas of vulnerability, investigating the experiences that make people susceptible to poor life outcomes (or that lead kids to be “troubled,” as Garmezy put it). Garmezy’s work opened the door to the study of protective factors: the elements of an individual’s background or personality that could enable success despite the challenges they faced. Garmezy retired from research before reaching any definitive conclusions—his career was cut short by early-onset Alzheimer’s—but his students and followers were able to identify elements that fell into two groups: individual, psychological factors and external, environmental factors, or disposition on the one hand and luck on the other.
In 1989 a developmental psychologist named Emmy Werner published the results of a thirty-two-year longitudinal project. She had followed a group of six hundred and ninety-eight children, in Kauai, Hawaii, from before birth through their third decade of life. Along the way, she’d monitored them for any exposure to stress: maternal stress in utero, poverty, problems in the family, and so on. Two-thirds of the children came from backgrounds that were, essentially, stable, successful, and happy; the other third qualified as “at risk.” Like Garmezy, she soon discovered that not all of the at-risk children reacted to stress in the same way. Two-thirds of them “developed serious learning or behavior problems by the age of ten, or had delinquency records, mental health problems, or teen-age pregnancies by the age of eighteen.” But the remaining third developed into “competent, confident, and caring young adults.” They had attained academic, domestic, and social success—and they were always ready to capitalize on new opportunities that arose.

What was it that set the resilient children apart? Because the individuals in her sample had been followed and tested consistently for three decades, Werner had a trove of data at her disposal. She found that several elements predicted resilience. Some elements had to do with luck: a resilient child might have a strong bond with a supportive caregiver, parent, teacher, or other mentor-like figure. But another, quite large set of elements was psychological, and had to do with how the children responded to the environment. From a young age, resilient children tended to “meet the world on their own terms.” They were autonomous and independent, would seek out new experiences, and had a “positive social orientation.” “Though not especially gifted, these children used whatever skills they had effectively,” Werner wrote. Perhaps most importantly, the resilient children had what psychologists call an “internal locus of control”: they believed that they, and not their circumstances, affected their achievements. The resilient children saw themselves as the orchestrators of their own fates. In fact, on a scale that measured locus of control, they scored more than two standard deviations away from the standardization group.
Werner also discovered that resilience could change over time. Some resilient children were especially unlucky: they experienced multiple strong stressors at vulnerable points and their resilience evaporated. Resilience, she explained, is like a constant calculation: Which side of the equation weighs more, the resilience or the stressors? The stressors can become so intense that resilience is overwhelmed. Most people, in short, have a breaking point. On the flip side, some people who weren’t resilient when they were little somehow learned the skills of resilience. They were able to overcome adversity later in life and went on to flourish as much as those who’d been resilient the whole way through. This, of course, raises the question of how resilience might be learned.
George Bonanno is a clinical psychologist at Columbia University’s Teachers College; he heads the Loss, Trauma, and Emotion Lab and has been studying resilience for nearly twenty-five years. Garmezy, Werner, and others have shown that some people are far better than others at dealing with adversity; Bonanno has been trying to figure out where that variation might come from. Bonanno’s theory of resilience starts with an observation: all of us possess the same fundamental stress-response system, which has evolved over millions of years and which we share with other animals. The vast majority of people are pretty good at using that system to deal with stress. When it comes to resilience, the question is: Why do some people use the system so much more frequently or effectively than others?
One of the central elements of resilience, Bonanno has found, is perception: Do you conceptualize an event as traumatic, or as an opportunity to learn and grow? “Events are not traumatic until we experience them as traumatic,” Bonanno told me, in December. “To call something a ‘traumatic event’ belies that fact.” He has coined a different term: PTE, or potentially traumatic event, which he argues is more accurate. The theory is straightforward. Every frightening event, no matter how negative it might seem from the sidelines, has the potential to be traumatic or not to the person experiencing it. (Bonanno focusses on acute negative events, where we may be seriously harmed; others who study resilience, including Garmezy and Werner, look more broadly.) Take something as terrible as the surprising death of a close friend: you might be sad, but if you can find a way to construe that event as filled with meaning (perhaps it leads to greater awareness of a certain disease, say, or to closer ties with the community), then it may not be seen as a trauma. (Indeed, Werner found that resilient individuals were far more likely to report having sources of spiritual and religious support than those who weren’t.) The experience isn’t inherent in the event; it resides in the event’s psychological construal.
It’s for this reason, Bonanno told me, that “stressful” or “traumatic” events in and of themselves don’t have much predictive power when it comes to life outcomes. “The prospective epidemiological data shows that exposure to potentially traumatic events does not predict later functioning,” he said. “It’s only predictive if there’s a negative response.” In other words, living through adversity, be it endemic to your environment or an acute negative event, doesn’t guarantee that you’ll suffer going forward. What matters is whether that adversity becomes traumatizing.
The good news is that positive construal can be taught. “We can make ourselves more or less vulnerable by how we think about things,” Bonanno said. In research at Columbia, the neuroscientist Kevin Ochsner has shown that teaching people to think of stimuli in different ways—to reframe them in positive terms when the initial response is negative, or in a less emotional way when the initial response is emotionally “hot”—changes how they experience and react to the stimulus. You can train people to better regulate their emotions, and the training seems to have lasting effects.
Similar work has been done with explanatory styles—the techniques we use to explain events. I’ve written before about the research of Martin Seligman, the University of Pennsylvania psychologist who pioneered much of the field of positive psychology: Seligman found that training people to change their explanatory styles from internal to external (“Bad events aren’t my fault”), from global to specific (“This is one narrow thing rather than a massive indication that something is wrong with my life”), and from permanent to impermanent (“I can change the situation, rather than assuming it’s fixed”) made them more psychologically successful and less prone to depression. The same goes for locus of control: not only is a more internal locus tied to perceiving less stress and performing better but changing your locus from external to internal leads to positive changes in both psychological well-being and objective work performance. The cognitive skills that underpin resilience, then, seem like they can indeed be learned over time, creating resilience where there was none.
Unfortunately, the opposite may also be true. “We can become less resilient, or less likely to be resilient,” Bonanno says. “We can create or exaggerate stressors very easily in our own minds. That’s the danger of the human condition.” Human beings are capable of worry and rumination: we can take a minor thing, blow it up in our heads, run through it over and over, and drive ourselves crazy until we feel like that minor thing is the biggest thing that ever happened. In a sense, it’s a self-fulfilling prophecy. Frame adversity as a challenge, and you become more flexible and able to deal with it, move on, learn from it, and grow. Focus on it, frame it as a threat, and a potentially traumatic event becomes an enduring problem; you become more inflexible, and more likely to be negatively affected.

In December the New York Times Magazine published an essay called “The Profound Emptiness of ‘Resilience.’ “ It pointed out that the word is now used everywhere, often in ways that drain it of meaning and link it to vague concepts like “character.” But resilience doesn’t have to be an empty or vague concept. In fact, decades of research have revealed a lot about how it works. This research shows that resilience is, ultimately, a set of skills that can be taught. In recent years, we’ve taken to using the term sloppily—but our sloppy usage doesn’t mean that it hasn’t been usefully and precisely defined. It’s time we invest the time and energy to understand what “resilience” really means.

Why there is no Hitchhiker’s Guide to Mathematics for Programmers By Jeremy Kun

source

Do you really want to get better at mathematics?

Remember when you first learned how to program? I do. I spent two years experimenting with Java programs on my own in high school. Those two years collectively contain the worst and most embarrassing code I have ever written. My programs absolutely reeked of programming no-nos. Hundred-line functions and even thousand-line classes, magic numbers, unreachable blocks of code, ridiculous code comments, a complete disregard for sensible object orientation, negligence of nearly all logic, and type-coercion that would make your skin crawl. I committed every naive mistake in the book, and for all my obvious shortcomings I considered myself a hot-shot programmer! At least I was learning a lot, and I was a hot-shot programmer in a crowd of high-school students interested in game programming.
Even after my first exposure and my commitment to get a programming degree in college, it was another year before I knew what a stack frame or a register was, two before I was anywhere near competent with a terminal, three before I learned to appreciate functional programming, and to this day I still have an irrational fear of networking and systems programming (the first time I manually edited the call stack I couldn’t stop shivering with apprehension and disgust at what I was doing).
I just made this function call return to a *different* place than where it was called from.
In a class on C++ programming I was programming a Checkers game, and my task at the moment was to generate a list of all possible jump-moves that could be made on a given board. This naturally involved a depth-first search and a couple of recursive function calls, and once I had something I was pleased with, I compiled it and ran it on my first non-trivial example. Lo and behold (even having followed test-driven development!), I was hit hard in the face by a segmentation fault. It took hundreds of test cases and more than twenty hours of confusion before I found the error: I was passing a reference when I should have been passing a pointer. This was not a bug in syntax or semantics (I understood pointers and references well enough) but a design error. And the aggravating part, as most programmers know, was that the fix required the change of about 4 characters. Twenty hours of work for four characters! Once I begrudgingly verified it worked (of course it worked, it was so obvious in hindsight), I promptly took the rest of the day off to play Starcraft.
Of course, as every code-savvy reader will agree, all of this drama is part of the process of becoming a strong programmer. One must study the topics incrementally, make plentiful mistakes and learn from them, and spend uncountably many hours in a state of stuporous befuddlement before one can be considered an experienced coder. This gives rise to all sorts of programmer culture, unix jokes, and reverence for the masters of C that make the programming community so lovely to be a part of. It’s like a secret club where you know all the handshakes. And should you forget one, a crafty use of awk and sed will suffice.
"Semicolons of Fury" was the name of my programming team in the ACM collegiate programming contest. We placed Cal Poly third in the Southern California Regionals.
“Semicolons of Fury” was the name of my programming team in the ACM collegiate programming contest. We placed Cal Poly third in the Southern California Regionals, and in my opinion our success was due in large part to the dynamics of our team. I (center, in blue) have since gotten a more stylish haircut.
Now imagine someone comes along and says,
“I’m really interested in learning to code, but I don’t plan to write any programs and I absolutely abhor tracing program execution. I just want to use applications that others have written, like Chrome and iTunes.”
You would laugh at them! And the first thing that would pass through your mind is either, “This person would give up programming after the first twenty minutes,” or “I would be doing the world a favor by preventing this person from ever writing a program. This person belongs in some other profession.” This lies in stark opposition to the common chorus that everyone should learn programming. After all, it’s a constructive way to think about problem solving and a highly employable skill. In today’s increasingly technological world, it literally pays to know your computer better than a web browser. (Ironically, I’m writing this on my Chromebook, but in my defense it has a terminal with ssh. Perhaps more ironically, all of my real work is done with paper and pencil.)
Unfortunately this sentiment is mirrored among most programmers who claim to be interested in mathematics. Mathematics is fascinating and useful and doing it makes you smarter and better at problem solving. But a lot of programmers think they want to do mathematics, and they either don’t know what “doing mathematics” means, or they don’t really mean they want to do mathematics. The appropriate translation of the above quote for mathematics is:
“Mathematics is useful and I want to be better at it, but I won’t write any original proofs and I absolutely abhor reading other people’s proofs. I just want to use the theorems others have proved, like Fermat’s Last Theorem and the undecidability of the Halting Problem.”
Of course no non-mathematician is really going to understand the current proof of Fermat’s Last Theorem, just as no fledgling programmer is going to attempt to write a (quality) web browser. The point is that the sentiment is in the wrong place. Mathematics is cousin to programming in terms of the learning curve, obscure culture, and the amount of time one spends confused. And mathematics is as much about writing proofs as software development is about writing programs (it’s not everything, but without it you can’t do anything). Honestly, it sounds ridiculously obvious to say it directly like this, but the fact remains that people feel like they can understand the content of mathematics without being able to write or read proofs.
I want to devote the rest of this post to exploring some of the reasons why this misconception exists. My main argument is that the reasons have more to do with the culture of mathematics than the actual difficulty of the subject. Unfortunately, as of the time of this writing, I don’t have a proposed “solution.” All I can claim is that part of the problem is that programmers can have mistaken views of what mathematics involves. I don’t propose a way to make mathematics easier for programmers, although I do try to make the content on my blog as clear as possible (within reason). I honestly do believe that the struggle and confusion builds mathematical character, just as the arduous bug-hunt builds programming character. If you want to be good at mathematics, there is no other way.
All I want to do with this article is to detail why mathematics can be so hard for beginners, to explain a few of the secret handshakes, and hopefully to bring an outsider a step closer to becoming an insider. And I want to stress that this is not a call for all programmers to learn mathematics. Far from it! I just happen to notice that, for good reason, the proportion of programmers who are interested in mathematics is larger than in most professions. And as a member of both communities, I want to shed light on why mathematics can be difficult for an otherwise smart and motivated software engineer.
So read on, and welcome to the community.

Travelling far and wide

Perhaps one of the most prominent objections to devoting a lot of time to mathematics is that it can be years before you ever apply mathematics to writing programs. On one hand, this is an extremely valid concern. If you love writing programs and designing software, then mathematics is nothing more than a tool to help you write better programs.
But on the other hand, the very nature of mathematics is what makes it so applicable, and the only way to experience nature is to ditch the city entirely. Indeed, I provide an extended example of this in my journalesque post on introducing graph theory to high school students: the point of the whole exercise is to filter out the worldly details and distill the problem into a pristine mathematical form. Only then can we see its beauty and wide applicability.
Here is a more concrete example. Suppose you were trying to encrypt the contents of a message so that nobody could read it even if they intercepted the message in transit. Your first ideas would doubtlessly be the same as those of our civilization’s past: substitution ciphers, Vigenère ciphers, the Enigma machine, etc. Regardless of what method you come up with, your first thought would most certainly not be, “prime numbers so big they’ll make your pants fall down.” Of course, the majority of encryption methods today rely on very deep facts (or rather, conjectures) about prime numbers, elliptic curves, and other mathematical objects (“group presentations so complicated they’ll orient your Möbius band,” anyone?). But it took hundreds of years of number theory to get there, and countless deviations into other fields and dead-ends. It’s not that the methods themselves are particularly complicated, but the way they’re often presented (and this is unavoidable if you’re interested in new mathematical breakthroughs) is in the form of classical mathematical literature.
Of course there are other examples much closer to contemporary fashionable programming techniques. One such example is boosting. While we have yet to investigate boosting on this blog [update: yes we have], the basic idea is that one can combine a bunch of algorithms which perform just barely better than 50% accuracy, and collectively they will be arbitrarily close to perfect. In a field dominated by practical applications, this result is purely the product of mathematical analysis.
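To see why that is even plausible, here is a toy simulation (my own sketch of the underlying intuition, not the actual boosting algorithm, which builds its weak learners adaptively rather than independently): simulate classifiers that are each right only 55% of the time and let them vote.

import random

def weak_vote(truth, accuracy=0.55):
    # A simulated "weak learner": right with probability 0.55, otherwise wrong.
    return truth if random.random() < accuracy else 1 - truth

def majority_vote(truth, n_learners):
    votes = sum(weak_vote(truth) for _ in range(n_learners))
    return 1 if votes > n_learners / 2 else 0

def estimated_accuracy(n_learners, trials=10000):
    correct = sum(majority_vote(1, n_learners) == 1 for _ in range(trials))
    return correct / trials

for n in (1, 11, 101, 1001):
    print(n, estimated_accuracy(n))
# The majority's accuracy climbs toward 1.0 as n grows, even though each
# individual vote is only 55% accurate (here the votes are independent;
# real boosting has to work much harder to get a similar effect).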
And of course boosting in turn relies on the mathematics of probability theory, which in turn relies on set theory and measure theory, which in turn relies on real analysis, and so on. One could get lost for a lifetime in this mathematical landscape! And indeed, the best way to get a good view of it all is to start at the bottom. To learn mathematics from scratch. The working programmer simply doesn’t have time for that.

What is it really, that people have such a hard time learning?

Most of the complaints about mathematics come, understandably, from notation and abstraction. And while I’ll have more to say on that below, I’m fairly certain that the main obstacle is a lack of familiarity with the basic methods of proof.
While methods of proof are semantical by nature, in practice they form a scaffolding for all of mathematics, and as such one could better characterize them as syntactical. I’m talking, of course, about the four basics: direct implication, proof by contradiction, contrapositive, and induction. These are the loops, if statements, pointers, and structs of rigorous argument, and there is simply no way to understand the mathematics without a native fluency in this language.
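For the uninitiated, here is what one of the basic four looks like in action; a minimal example of my own (written in LaTeX, the notation already used on this blog), proving by contrapositive a fact every programmer already believes.

\textbf{Claim.} If $n^2$ is even, then $n$ is even.

\textbf{Proof (by contrapositive).} Suppose $n$ is odd, so $n = 2k + 1$ for some integer $k$. Then
\[ n^2 = (2k+1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1, \]
which is odd. So if $n$ is odd then $n^2$ is odd; equivalently, if $n^2$ is even then $n$ is even. $\blacksquare$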

The “Math Major Sloth” is fluent. Why aren’t you?
So much of mathematics is built up by chaining together a multitude of absolutely trivial statements which are amenable to proof by the basic four. I’m not kidding when I say they are absolutely trivial. A professor of mine once said,
If it’s not completely trivial, then it’s probably not true.
I can’t agree more with this statement. Of course, there are many sophisticated proofs in mathematics, but an overwhelming majority of (very important) facts fall in the trivial category. That being said, trivial can sometimes be relative to one’s familiarity with a subject, but that doesn’t make the sentiment any less right. Drawing up a shopping list is trivial once you’re comfortable with a pencil and paper and you know how to write (and you know what the words mean). There are certainly works of writing that require a lot more than what it takes to write a shopping list. Likewise, when we say something is trivial in mathematics, it’s because there’s no content to the proof outside of using definitions and a typical application of the basic four methods of proof. This is the “holding a pencil” part of writing a shopping list.
And as you probably know, there are many many more methods of proof than just the basic four. Proof by construction, by exhaustion, case analysis, and even picture proofs have a place in all fields of mathematics. More relevantly for programmers, there are algorithm termination proofs, probabilistic proofs, loop invariants to design and monitor, and the ubiquitous NP-hardness proofs (I’m talking about you, Travelling Salesman Problem!). There are many books dedicated to showcasing such techniques, and rightly so. Clever proofs are what mathematicians strive for above all else, and once a clever proof is discovered, the immediate first step is to try to turn it into a general method for proving other facts. Fully fleshing out such a process (over many years, showcasing many applications and extensions) is what makes one a world-class mathematician.
Another difficulty faced by programmers new to mathematics is the inability to check your proof absolutely. With a program, you can always write test cases and run them to ensure they all pass. If your tests are solid and plentiful, the computer will catch your mistakes and you can go fix them.
There is no corresponding “proof checker” for mathematics. There is no compiler to tell you that it’s nonsensical to construct the set of all sets, or that it’s a type error to quotient a set by something that’s not an equivalence relation. The only way to get feedback is to seek out other people who do mathematics and ask their opinion. Working solo, mathematics involves a lot of backtracking, revising mistaken assumptions, and stretching an idea to its breaking point, only to see that it didn’t even make sense to begin with. This is “bug hunting” in mathematics, and it can often completely destroy a proof and make one start over from scratch. It feels like writing a few hundred lines of code only to have the final program run “rm -rf *” on the directory containing it. It can be really. really. depressing.
It is an interesting pedagogical question in my mind whether there is a way to introduce proofs and the language of mature mathematics in a way that stays within a stone’s throw of computer programs. It seems like a worthwhile effort, but I can’t think of anyone who has sought to replace a classical mathematics education entirely with one based on computation.

Mathematical syntax

Another major reason programmers are unwilling to give mathematics an honest effort is the culture of mathematical syntax: it’s ambiguous, and there’s usually nobody around to explain it to you. Let me start with an example of why this is not a problem in programming. Let’s say we’re reading a Python program and we see an expression like this:
foo[2]
The nature of (most) programming languages dictates that there are a small number of ways to interpret what’s going on in here:
  1. foo could be a list/tuple, and we’re accessing the third element in it.
  2. foo could be a dictionary, and we’re looking up the value associated with the key 2.
  3. foo could be a string, and we’re extracting the third character.
  4. foo could be a custom-defined object, whose __getitem__ method is defined somewhere else and we can look there to see exactly what it does.
There are probably other times this notation can occur (although I’d be surprised if number 4 didn’t by default capture all possible uses), but the point is that any programmer reading this program knows enough to intuit that square brackets mean “accessing an item inside foo with identifier 2.” Part of the reason programs can be very easy to read is precisely that someone had to write a parser for a programming language, and so they had to literally enumerate all possible uses of any expression form.
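For instance, case 4 might look like the following (a hypothetical toy class of my own): the language pins down exactly what the square brackets mean, namely a call to __getitem__, and anyone can go read that method.

class Squares:
    # foo[2] dispatches here; the class author decides what indexing means.
    def __getitem__(self, key):
        return key * key   # "indexing" into the sequence 0, 1, 4, 9, ...

foo = Squares()
print(foo[2])  # prints 4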
The other extreme is the syntax of mathematics. The daunting fact is that there is no bound to what mathematical notation can represent, and much of mathematical notation is inherently ad hoc. For instance, if you’re reading a math paper and you come across an expression that looks like this
\delta_i^j
The possibilities of what this could represent are literally endless. Just to give the unmathematical reader a taste: \delta_i could be an entry of a sequence of numbers of which we’re taking arithmetic j^\textup{th} powers. The use of the letter delta could signify a slightly nonstandard way to write the Kronecker delta function, for which \delta_i^j is one precisely when i=j and zero otherwise. The superscript j could represent dimension. Indeed, I’m currently writing an article in which I use \delta^k_n to represent k-dimensional simplex numbers, specifically because I’m relating the numbers to geometric objects called simplices, and the letter for those is  a capital \Delta. The fact is that using notation in a slightly non-standard way does not invalidate a proof in the way that it can easily invalidate a program’s correctness.
What’s worse is that once mathematicians get comfortable with a particular notation, they will often “naturally extend” or even silently drop things like subscripts and assume their reader understands and agrees with the convenience! For example, here is a common difficulty that beginners face in reading math that involves use of the summation operator. Say that I have a finite set of numbers whose sum I’m interested in. The most rigorous way to express this is not far off from programming:
Let S = \left \{ x_1, \dots, x_n \right \} be a finite set of things. Then their sum is finite:
\displaystyle \sum_{i=1}^n x_i
The programmer would say “great!” Assuming I know what “+” means for these things, I can start by adding x_1 + x_2, add the result to x_3, and keep going until I have the whole sum. This is really just a left fold of the plus operator over the list S.
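In Python terms (a sketch of my own, not from the original post), that left-fold reading is literally functools.reduce:

from functools import reduce
import operator

S = [3, 1, 4, 1, 5]   # a hypothetical finite list of "things" x_1, ..., x_n

# sum_{i=1}^n x_i as a left fold of "+" over S, starting from 0:
total = reduce(operator.add, S, 0)   # ((((0 + 3) + 1) + 4) + 1) + 5
print(total)  # 14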
But for mathematicians, the notation is far more flexible. For instance, I could say
Let S be finite. Then \sum_{x \in S} x is finite.
Things are now more vague. We need to remember that the \in symbol means “in.” We have to realize that the strict syntax of having an iteration variable i is no longer in effect. Moreover, the order in which the things are summed (which for a left fold is strictly prescribed) is arbitrary. If you asked any mathematician, they’d say “well of course it’s arbitrary, in an abelian group addition is commutative so the order doesn’t matter.” But realize, this is yet another fact that the reader must be aware of to be comfortable with the expression.
But it still gets worse.
In the case of the capital Sigma, there is nothing syntactically stopping a mathematician from writing
\displaystyle \sum_{\sigma \in \Sigma} f_{\Sigma}(\sigma)
Though experienced readers may chuckle, they will have no trouble understanding what is meant here. That is, syntactically this expression is unambiguous enough to avoid an outcry: \Sigma just happens to also be a set, and saying f_{\Sigma} means that the function f is constructed in a way that depends on the choice of the set \Sigma. This often shows up in computer science literature, as \Sigma is a standard letter to denote an alphabet (such as the binary alphabet \left \{ 0,1 \right \}).
One can even take it a step further and leave out the set we’re iterating over, as in
\displaystyle \sum_{\sigma} f_{\Sigma}(\sigma)
since it’s understood that the lowercase letter (\sigma) is usually an element of the set denoted by the corresponding uppercase letter (\Sigma). If you don’t know greek and haven’t seen that coincidence enough times to recognize it, you would quickly get lost. But programmers must realize: this is just the mathematician’s secret handshake. A mathematician would be just as bewildered and confused upon seeing some of the pointer arithmetic hacks C programmers invent, or the always awkward infinite for loop, if they had not had enough experience dealing with the syntax of standard for loops.
for (;;) {
   ;
}
In fact, a mathematician would look at this in disgust! The fact that the C programmer has need for something as pointless as an “empty statement” should be viewed as a clumsy inelegance in the syntax of the programming language (says the mathematician). Since mathematicians have the power to change their syntax at will, they would argue that, were this a mathematical expression, there would be no good reason not to change it to something simpler.
And once the paper you’re reading is over, and you start reading a new paper, chances are their conventions and notation will be ever-so-slightly different, and you have to keep straight what means what. It’s as if the syntax of a programming language changed depending on who was writing the program!
Perhaps understandably, the varying syntax across different papers and books that frustrates most mathematicians is collectively dismissed as “technicalities.” And the more advanced the mathematics becomes, the more the ability to fluidly transition between high-level intuition and technical details is all but assumed.
The upshot of this whole conversation is that the reader of a mathematical proof must hold in mind a vastly larger body of absorbed (and often frivolous) knowledge than the reader of a computer program.
At this point you might see all of this as my complaining, but in truth I’m saying this notational flexibility and ambiguity is a benefit. Once you get used to doing mathematics, you realize that technical syntax can make something which is essentially simple seem much more difficult than it is. In other words, we absolutely must have a way to make things completely rigorous, but in developing and presenting proofs the most important part is to make the audience understand the big picture, see the intuition behind the symbols, and believe the proofs. For better or worse, mathematical syntax is just a means to that end, and the more abstract the mathematics becomes, the more flexibility mathematicians need to keep themselves afloat in a tumultuous sea of notation.

You’re on your own, unless you’re around mathematicians

That brings me to my last point: reading mathematics is much more difficult than conversing about mathematics in person. The reason for this is once again cultural.
Imagine you’re reading someone else’s program, and they’ve defined a number of functions like this (pardon the single-letter variable names; as long as one is willing to be vague I prefer single-letter variable names to “foo/bar/baz”).
def splice(L):
   ...

def join(*args):
   ...

def flip(x, y):
   ...
There are two parts to understanding how these functions work. The first part is that someone (or a code comment) explains to you in a high level what they do to an input. The second part is to weed out the finer details. These “finer details” are usually completely spelled out by the documentation, but it’s still a good practice to experiment with it yourself (there is always the possibility for bugs or unexpected features, of course).
In mathematics there is no unified documentation, just a collective understanding, scattered references, and spoken folklore. You’re lucky if a textbook has a table of notation in the appendix. You are expected to derive the finer details and catch the errors yourself. Even if you are told the end result of a proposition, it is often followed by, “The proof is trivial.” This is the mathematician’s version of piping output to /dev/null, and literally translates to, “You’re expected to be able to write the proof yourself, and if you can’t then maybe you’re not ready to continue.”
Indeed, the opposite problems are familiar to a beginning programmer when they aren’t in a group of active programmers. Why is it that people give up or don’t enjoy programming? Is it because they have a hard time getting honest help from rudely abrupt moderators on help websites like stackoverflow? Is it because often when one wants to learn the basics, they are overloaded with the entirety of the documentation and the overwhelming resources of the internet and all its inhabitants? Is it because compiler errors are nonsensically exact, but very rarely helpful? Is it because when you learn it alone, you are bombarded with contradicting messages about what you should be doing and why (and often for the wrong reasons)?
All of these issues definitely occur, and I see them contribute to my students’ confusion in my introductory Python class all the time. They try to look on the web for information about how to solve a very basic problem, and they come back to me saying they were told it’s more secure to do it this way, or more efficient to do it this way, or that they need to import something called the “heapq module.” When really the goal is not to solve the problem in the best way possible or in the shortest amount of code, but to show them how to use the tools they already know about to construct a program that works. Without a guiding mentor it’s extremely easy to get lost in the jungle of people who think they know what’s best.
As far as I know there is no solution to this problem faced by the solo programming student (or the solo anything student). And so it stands for mathematics: without others doing mathematics with you, it’s very hard to identify your issues and see how to fix them.

Proofs, Syntax, and Community

For the programmer who is truly interested in improving their mathematical skills, the first line of attack should now be obvious. Become an expert at applying the basic methods of proof. Second, spend as much time as it takes to clear up what mathematical syntax means before you attempt to interpret the semantics. And finally, find others who are interested in seriously learning some mathematics, and work on exercises (perhaps a weekly set) with them. Start with something basic like set theory, and write your own proofs and discuss each other’s proofs. Treat the sessions like code review sessions, and be the compiler to your partner’s program. Test their arguments to the extreme, and question anything that isn’t obvious or trivial. It’s not uncommon for easy questions with simple answers and trivial proofs to create long and drawn-out discussions before everyone agrees it’s obvious. Embrace this and use it to improve.

Short of returning to your childhood and spending more time doing recreational mathematics, that is the best advice I can give.

Magic and the rise of science by Diane Purkiss

source

How could magic happen? In the pre-Christian era as in the Christian one, the primary idea was to summon an entity with astounding power – an angel, or a demon, or a god. These beings behaved more like Marvel superheroes than magicians; they could travel incredibly fast, or they had astonishing strength. But with Christianity came the problem – what did the entities themselves want? The first to try to find a workaround for the idea that all magic is demonic was Albert the Great, who tried to make the astrological influence of the planets systematic. Medieval thinkers already knew that the sun’s rays could turn common soil into gold. It followed that alchemists might hope to do the same by using the rays of the planets intelligently; it also followed that “as above, so below”. Since the heavens affected life on Earth, the bite of a scorpion – for instance – might be cured by pressing the image of a scorpion into incense while the moon is correctly aligned with Scorpio. Worryingly, Scorpio itself and its many faces are, Brian Copenhaver thinks, gods and demons, but these can be used innocently because of the “as above, so below” rule; they are not working for the magician, but simply working in the way they always do.
This kind of question preoccupies most of those writers summoned by Copenhaver in The Book of Magic, an impressive and well-edited anthology. Those writers also pondered and indeed fretted over the relation between powerful entities and the laws governing matter; they raised scruples about how far the alien energy of summonable beings is connected with their usual function, or is pulled from that norm by the action of the magician. When Shakespeare’s Puck speaks of putting a girdle round the Earth in forty minutes, he is offering an explanation of what he can achieve and of how he achieves it.
To say complications ensue is putting it mildly. Schemas are confounded by efforts to find a legitimacy for magic. The English word comes ultimately from Greek magike (in which the original Persian word is spliced with tekhne, “art”), while the Persian magos “one of the members of the learned and priestly class” ultimately derives from magush, “to be able, to have power”, from which we may also derive the word “machine”. So my social hierarchy is your magic, and my magic might be your craft – or even your machinery. My religion is your magic. Your religion is my fairy lore. Or your religions might be a mass of fakery and trickery and foolery. Hence in making magic into an intellectual discipline, I theorize based on my observations, which might not be mine but those of others, heritable observations. But because what I do looks very like empiricism, as I examine materials for the tricks or fooleries, or for the real alterations, checking my results against descriptions of previous experiments, what I do feels like science, feels like the template for Baconian empiricism and its great instauration.
What fascinates Copenhaver is the overlap between magic and science. His anthology probes the moment when “the author of a scientific encyclopedia wrote that the skin of a hyena will ward off the evil eye”. Drawing on Max Weber’s idea of a “disenchanted world”, Copenhaver uses the unusual form of the anthology to trace the arc of disenchantment. Magic, Weber thought, was ritual while religion is ethical; magic coerces, but religion supplicates. Yet, by Weber’s standards, Moses and Jesus were magicians (as Christopher Marlowe also noted, allegedly saying that “Moses was but a juggler, and one Harriot, Wat Ralegh’s man, can do more than he”). Copenhaver seems unaware of other recent responses to Weberian thinking, notably Morris Berman’s The Re-enchantment of the World (1981), which takes a hard and critical look at Cartesian rationality and materialism.


 Yet Copenhaver’s choice of texts does little to unsettle the assumption that Protestants are sceptical. While Protestant polemicists went to inordinate lengths to portray their Catholic opponents as either jugglers or as actual Satan-worshippers, these endeavours did not make them appear rational, but simply fearful. Later, the idea of Catholic demon-worshippers was turned into the equally fictive idea of Catholic witch-hunters, even though many of the worst witch-hunts took place in Protestant-dominated places. For Whigs, magic eventually collapses into science, and alchemy into economics and mercantilism, while household theurgies are displaced by insurance policies and the welfare state. But what about all the people for whom this doesn’t appear to have happened?
All anthologies aim at being representative, even when they are not. With Copenhaver in hand, few of the most diligent would look for more; the volume is 643 pages long. And yet there is more, much more, and on that “more” may hinge the big shifts in magical thinking choreographed – but not explained – in these pages. He has evidently decided to omit Northern European magic, the magics of the Gaelic-speaking lands and also the seiðr and trolldomr of the Scandinavians. Admittedly, these magics are less easy to manage in the framework of the history of ideas; they are messier than Graeco-Roman magic, involving the intrusive and the invasive rather than the control and management of subordinate nature. Yet, a comparison of the transformative becoming-animal magics of British Insular myths and sagas with the alchemy derived from a symbiosis of Hellenic and Egyptian cultures might have been fruitful, allowing readers to think about the ways in which one idea of magic necessarily excludes another. Nor is Copenhaver especially interested in magical narratives, so we do not venture into German lands, except as post-Roman subjects. The result is to leave the cultural divisions of Europe very much where they were, with anywhere north and east of Paris not only marginal but absent. The rationale might be that all Western ideas of magic in high culture derive from the classical world on one hand, and the Near Eastern world on the other.
Similarly, Copenhaver is much more at home with swashbuckling intellectuals like Pico della Mirandola than he is with the average village cunning folk. Yet he does discuss the idea of fascinatores, the natural evil-eyed persons, and both he and Cornelius Agrippa struggle with intentionality: is the eye a natural phenomenon, or an intended will to harm? He cites the sceptics Burchard of Worms and Reginald Scot, both amusedly reporting folkloric practices which may have been as rational as alchemy, as well supported as astrological mineralogy. In the case of the female votaries of Diana and the English cunning folk, belief may have been underpinned by a complex web of mythological story and ideas of the liminal, the environment and a more fuzzily bounded concept of personhood which allowed people to influence one another in the manner believed to be possible for the planets and constellations.
It is telling and dispiriting that the elimination of the folkloric also turns out to eliminate women, who do not appear in the anthology at all as authors or compilers. It is wrong to imagine that nothing survives; Jayne Archer’s work has shown that many housewives had a solid working knowledge of alchemy, while the average peasant – male and female – knew all about the alleged effect of the waxing moon on growth. But in this volume, women and the lower orders feature mainly as blockages in the progress of knowledge, though only a very little more delving might have brought to light a female magic of household charms that outlived the theological speculations on which the anthology is firmly centred. While clearly lauding Scot’s braying scepticism, Copenhaver does not seem fully aware of his reception. The Bodleian Library copy of Reginald Scot’s Discoverie of Witchcraft (Bodley MS. Add B1) was used by a cunning magician, who evidently took Scot’s meticulous reportage of popular magic as a sourcebook for his own practice of it. Scot himself drew on a manuscript called Secretum secretorum, itself the grimoire of a pair of cunning men. The weaving in and out of the learned and the popular illustrates that only naivety would seek to separate them from one another.
Another problem with Copenhaver’s scope is a mild tendency to round up some rather usual suspects. Three of the eleven subsections of this anthology are made up of texts which its likely readership probably already owns, including the Old and New Testaments in modern translations, and the literary works of well-known poets and dramatists such as Marlowe, Spenser, Jonson, Shakespeare and Molière. Copenhaver could have chosen far less well-known dramas by Peele, Middleton, Dekker, Rowley and Ford, with far more illuminating results for the general reader. A good deal of the demonology here is well known, too, with the Malleus Maleficarum hiked into an unduly prominent position; we leap into that intellectual catastrophe suddenly, rather than working our way to it via the Formicarius and the heresy trials and the particularities of Alpine folklore. The result is that the witch trials spring out as an unpredictable mass of unreason rather than as a development from reason itself in general and scholasticism in particular.
Where the collection really shines is in its compilation of mage thinking, the posh university-educated magic deriving ultimately from texts like The Emerald Tablet and On the Mysteries of Egypt; texts which in turn took their prestige from misunderstood hieroglyphics and mistranscribed or misunderstood top slices of Zoroastrianism. Yet mages hoped for true hearing. When Marsilio Ficino imagined magic as the act of singing to the stars, he also imagined the spirit imitating their movements, and that imitation could then move through warm breath to the ear of a listener, inspiring further imitation. But was this really magic at all, or did it simply feel magical? The same kind of question might be asked of Pietro Pomponazzi, who spoke of magnetism as real but argued that it was caused not by demons, but by the planets. After reason, the anti-magicians used satire, in rather the manner of Richard Dawkins duping the tarot card reader in his documentary film Slaves to Superstition; nobody wanted to be like the dupe, the victim of the cold reading. “I want to take on the enemies of reason” is a ringing call, but the awkward fact that emerges here is that the religious impetus to define magic led to a number of important scientific discoveries and breakthroughs. Magic, science and religion are entangled, and reading this rich collection allows each of us to cut our own path through the dense snarls of the history of a body of ideas which ultimately came to carry a heavy body count.
Diane Purkiss is Professor of English at Keble College, Oxford.