Primavera De Filippi is an expert in blockchain-based tech. She is giving a ThursdAI talk on Plantoid at an event held by Harvard’s Berkman Klein Center for Internet & Society and the MIT Media Lab. Her talk is officially about operational autonomy vs. decisional autonomy, but it’s really about how weird things become when you build a computerized flower that merges AI and the blockchain. For me, a central question of her talk was: Can we have autonomous robots that have legal rights and can own and spend assets, without having to resort to conferring personhood on them the way we have with corporations?

Autonomy and liability

She begins by pointing to the three industrial revolutions so far: steam led to mechanized production; electricity led to mass production; electronics led to automated production. The fourth — AI — is automating knowledge production.

People are increasingly moving into the digital world, and digital systems are moving back into the physical world, creating cyber-physical systems. E.g., the Internet of Things senses, communicates, and acts. The Internet of Smart Things learns from the data the things collect, makes inferences, and then acts. The Internet of Autonomous Things creates new legal challenges. Various actors can be held liable: the manufacturer, the software developer, the user, and third parties. “When do we apply legal personhood to non-humans?”

With autonomous things, the user and third parties become less liable as the software developer takes on more of the liability: there can be a bug; someone can hack into the device; the rules it uses to make inferences may be inaccurate; or a bad moral choice may lead the car into an accident.

The software developer might have created bug-free software, but its interaction with other devices might lead to unpredictability; multiple systems operating according to different rules might be incompatible; and it can be hard to identify the chain of causality. So, who will be liable? The manufacturers and owners are likely to have only limited liability.

Or perhaps we will provide some form of legal personhood to machines so the machines themselves can be sued for their failings. Suing a robot would be like suing a corporation. The devices would be able to own property and assets. The EU is thinking about creating this type of agenthood for AI systems. This is obviously controversial. At least a corporation has people associated with it, while the device is just a device, Primavera points out.

So, when do we apply legal personhood to non-humans? In addition to people and corporations, some countries have assigned personhood to chimpanzees (Argentina, France) and to natural resources (NZ: Whanganui river). We do this so these entities will have rights and cannot be simply exploited.

If we give legal personhood to AI-based systems, can AIs have property rights over their assets and IP? If they are legally liable, can they be held responsible for their actions and sued for compensation? “Maybe they should have contractual rights so they can enter into contracts. Can they be rewarded for their work? Taxed?” [All of these are going to turn out to be real questions. … Wait for it …]

Limitations: “Most of the AI-based systems deployed today are more akin to slaves than corporations.” They’re not autonomous the way people are. They are owned, controlled, and maintained by people or corporations. They act as agents for their operators. They have no technical means to own or transfer assets. (Primavera recommends watching the Star Trek: The Next Generation episode “The Measure of a Man,” which asks, among other things, whether Data (the android) can be dismantled and whether he can resign.)

Decisional autonomy is the capacity to make a decision on your own, but it doesn’t necessarily bring what we think of as real autonomy. E.g., an autonomous vehicle can decide its route. For real autonomy we need operational autonomy: no one is maintaining the thing’s operation at a technical level. To take a non-random example, a blockchain runs autonomously because there is no single operator controlling it. E.g., smart contracts come with a guarantee of execution: once a contract is registered on a blockchain, no operator can stop it. This is operational autonomy.

Blockchain meets AI. Object: Autonomy

We are getting the first examples of autonomous devices using blockchain. The most famous is the Samsung washing machine that can detect when the soap is empty and execute a smart contract to order more. Autonomous cars could work on the same model: they need not be owned by anyone, and they could collect money when someone uses them. They could be initially purchased by someone and then buy themselves back: “They’d have to be emancipated,” she says. Perhaps they and other robots could use the capital they accumulate to hire people to work for them. [Pretty interesting model for an Uber.]

She introduces Plantoid, a blockchain-based life form. “Plantoid is autonomous, self-sufficient, and can reproduce.” Real flowers use bees to reproduce. Plantoids use humans to collect capital for their reproduction. Their bodies are mechanical. Their spirit is an Ethereum smart contract. A Plantoid collects cryptocurrency. When you feed it currency, it says thank you; the Plantoid Primavera has brought nods its flower. When it has enough funds to reproduce itself, it triggers a smart contract that activates a call for bids to create the next version of the Plantoid. In the “mating phase” it looks for a human to create the new version. People vote with micro-donations. Then it identifies a winner and hires that human to create the new one.
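The lifecycle just described (collect donations until a threshold is reached, then run a call for bids decided by micro-donation votes) can be sketched as ordinary program logic. What follows is a hypothetical Python simulation of that flow, not the actual Ethereum contract; the class name, threshold, and methods are invented for illustration.

```python
# Hypothetical simulation of the Plantoid funding/reproduction flow described
# above. Plain Python, not the real Ethereum smart contract; all names and the
# threshold are invented for illustration.

REPRODUCTION_THRESHOLD = 1000  # invented threshold, in arbitrary currency units

class PlantoidSim:
    def __init__(self):
        self.balance = 0
        self.bids = {}    # artist -> proposal for the next Plantoid
        self.votes = {}   # artist -> total micro-donations backing the proposal

    def donate(self, amount):
        """Feed the plantoid; returns True once it can enter the mating phase."""
        self.balance += amount
        return self.balance >= REPRODUCTION_THRESHOLD

    def submit_bid(self, artist, proposal):
        """An artist answers the call for bids."""
        self.bids[artist] = proposal
        self.votes.setdefault(artist, 0)

    def vote(self, artist, micro_donation):
        """Donors vote with micro-donations, as in the talk."""
        self.votes[artist] += micro_donation

    def select_winner(self):
        """'Hire' the artist whose proposal attracted the most support."""
        return max(self.votes, key=self.votes.get)
```

In this toy version, a donation that pushes the balance past the threshold triggers the call for bids, and the contract then pays the winning artist to build the next Plantoid.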

There are many Plantoids in the world. Each has its own “DNA,” and new artists can add to it. E.g., each artist has to decide on its governance, such as whether it will donate some funds to charity, with the aim of making it more attractive to contribute to. The fittest get the most money and reproduce themselves. Burning Man this summer is going to feature this.

Every time one reproduces, a small cut is given to the parent that generated it, and some to the new designer. This flips copyright on its head: the artist has an incentive to make her design more visible, accessible, and attractive.

So, why provide legal personhood to autonomous devices? We want them to be able to own their own assets, to assume contractual rights, and to have the legal capacity to sue and be sued and to limit their liability. “Blockchain lets us do that without having to declare the robot to be a legal person.”

The plant effectively owns the cryptofunds. The law cannot affect this: smart contracts are enforced by code.

Who are the parties to the contract? The original author and new artist? The master agreement? Who can sue who in case of a breach? We don’t know how to answer these questions yet.

Can a Plantoid sue for breach of contract? Not if the legal system doesn’t recognize it as a legal person. So who is liable if the plant hurts someone? Can we provide a mechanism for this without conferring personhood? “How do you enforce the law against autonomous agents that cannot be stopped and whose property cannot be seized?”

Q&A

Q: Could you do this with live plants? People would bioengineer them…

A: Yes. Plantoid has already been forked this way. There’s an idea for a forest offering trees to be cut down, with the compensation going to the forest which might eventually buy more land to expand itself.

My interest in this grew out of my interest in decentralized organizations. This enables a project to be an entity that assumes liability for its actions, and to reproduce itself.

Q: [me] Do you own this plantoid?

A: Hmm. I own the physical instantiation but not the code or the smart contract. If this one broke, I could make a new one that connects to the same smart contract. If someone gets hurt because it falls on them, I’m probably liable. If the smart contract is funding terrorism, I’m not the owner of that contract. The physical object is doing nothing but reacting to donations.

Q: But the aim of its reactions is to attract more money…

A: It will be up to the judge.

Q: What are the most likely scenarios for the development of these weird objects?

A: A blockchain can provide the interface for humans interacting with each other without needing a legal entity, such as Uber, to centralize control. But you need people to decide to do this. The question is how these entities change the structure of the organization.

In 2016, the COMPAS algorithm became a household name (in some households) when ProPublica showed that it falsely flagged black defendants as high risk at nearly twice the rate of white defendants. People justifiably got worried that algorithms can be highly biased. At the same time, we think that algorithms may be smarter than humans, Ben says. These have been the poles of the discussion. Optimists think that we can limit the bias to take advantage of the added smartness.

There have been movements toward risk assessments for bail, rather than money bail. E.g., Rand Paul and Kamala Harris have introduced the Pretrial Integrity and Safety Act of 2017. There have also been movements to use risk scores only to reduce detention, not to increase it.

But are we asking the right questions? Yes, the criminal justice system would be better if judges could make more accurate and unbiased predictions, but it’s not clear that machine learning can do this. So, two questions: 1. Is ML an appropriate tool for this? 2. Is implementing ML algorithms an effective strategy for criminal justice reform?

#1 Is ML an appropriate tool to help judges make more accurate and unbiased predictions?

ML relies on data about the world. This can produce tunnel vision by causing us to focus on particular variables that we have quantified, and ignore others. E.g., when it comes to sentencing, a judge balances deterrence, rehabilitation, retribution, and incapacitating a criminal. COMPAS predicts recidivism, but none of the other factors. This emphasizes incapacitation as the goal of sentencing. This might be good or bad, but the ML has shifted the balance of factors, framing the decision without policy review or public discussion.

Q: Is this for sentencing or bail? Because incapacitation is a more important goal in sentencing than in bail.

A: This is about sentencing. I’ll be referring to both.

Data is always about the past, Ben continues. ML finds statistical correlations among inputs and outputs, and applies those correlations to new inputs. This assumes that those correlations will hold in the future; it assumes that the future will look like the past. But if we’re trying to reform the judicial system, we don’t want the future to look like the past. ML can thus entrench historical discrimination.

Arguments about the fairness of COMPAS are often based on competing mathematical definitions of fairness. But we could also think about the scope of what we count as fair. ML tries to make a very specific decision: among a population, who will recidivate? If you take a step back and consider the broader context of the data and the people, you would recognize that blacks recidivate at a higher rate than whites because of policing practices, economic factors, racism, etc. Without these considerations, you’re throwing away the context and accepting the current correlations as the ground truth. Even if the underlying data were to change, the algorithm wouldn’t reflect the change unless you retrain it.

Q: Who retrains the algorithm?

A: It depends on the contract the court system has.

Algorithms are not themselves a natural outcome of the world. Subjective decisions go into making them: which data to input, choosing what to predict, etc. The algorithms are brought into court as if they were facts. Their subjectivity is out of the frame. A human expert would be subject to cross examination. We should be thinking of algorithms that way. Cross examination might include asking how accurate the system is for the particular group the defendant is in, etc.

Q: These tools are used in setting bail or a sentence, i.e., before or after a trial. There may not be a venue for cross examination.

A: In the Loomis case, an expert witness testified that the algorithm was misused. That’s not exactly what I’m suggesting; they couldn’t get to all of it because of the trade secrecy of the algorithms.

Back to the framing question. If we can make the individual decision points fair, we sometimes think we’ve made the system fair. But technocratic solutions tend to sanitize rather than alter. You’re conceding the overall framework of the system, overlooking more meaningful changes. E.g., in NY, 71% of voters support ending pre-trial jail for misdemeanors and non-violent felonies. Maybe we should consider that. Or consider that cutting food stamps has been shown to increase recidivism. Or perhaps we should be reconsidering the wisdom of preventive detention, which was only introduced in the 1980s. Focusing on the tech draws focus away from these sorts of reforms.

Also, technocratic reforms are subject to political capture. E.g., NJ replaced money bail with a risk assessment tool. After some of the people released committed crimes, the tool was changed so that people charged with certain crimes were no longer eligible for release. What is an acceptable risk level? How do you set the number? Once it’s set, how is it changed?

Q: [me] So, is your idea that these ML tools drive out meaningful change, so we ought not to use them?

A: Roughly, yes.

[Much interesting discussion which I have not captured. E.g., Algorithms can take away the political impetus to restore bail as simply a method to prevent flight. But sentencing software is different, and better algorithms might help, especially if the algorithms are recommending sentences but not imposing them. And much more.]

#2 Do algorithms actually help?

How do judges use algorithms to make a decision? Even if the algorithm were perfect, would it improve the decisions judges make? We don’t have much of an empirical answer.

Ben was talking to Jeremy Heffner at HunchLab. They make predictive policing software and are well aware of the problem of bias. (“If there’s any bias in the system it’s because of the crime data. That’s what we’re trying to address.” — Heffner) But all of the suggestions they give to police officers are called “missions,” which is in the military/jeopardy frame.

People are bad at incorporating quantitative data into decisions, and they filter info through their biases. E.g., the “ban the box” campaign to remove the tick box about criminal backgrounds on job applications actually increased racial discrimination because employers assumed the white applicants were less likely to have arrest records. (Agan and Starr 2016) Also, people have been shown to interpret police camera footage according to their own prior opinions about the police. (Sommers 2016)

Evidence from Kentucky (Stevenson 2018): mandatory risk assessments for bail produced only a small increase in pretrial release, and even these changes eroded over time as judges returned to their previous habits.

So, we need to be asking the empirical question of how judges actually use these predictions. And should judges incorporate these predictions into their decisions?

Ben’s been looking at the first question: how do judges use algorithmic predictions? He’s running experiments on Mechanical Turk, showing people profiles of defendants — a couple of sentences about the crime, race, and previous arrest record. The Turkers have to give a prediction of recidivism. Ben knows which ones actually recidivated. Some Turkers are also given a recommendation based on an algorithmic assessment. That risk score might be the actual one, random, or biased; the Turkers don’t know which.

Q: It might be different if you gave this test to judges.

A: Yes, that’s a limitation.

Q: You ought to give some a percentage of something unrelated, e.g., it will rain, just to see if the number is anchoring people.

A: Good idea.

Q: [me] Suppose you find that the Turkers’ assessment of risk is more racially biased than the algorithm…

We’ve all heard now about AI-based algorithms that are being used to do risk assessments in pretrial bail decisions. She thinks this is a good place to start using algorithms, although it’s not easy.

The pre-trial stage is supposed to be very short. The court has to determine whether the defendant, presumed innocent, will be released on bail or jailed. The sole considerations are supposed to be whether the defendant is likely to harm someone else or to flee. Preventive detention has many effects, mostly negative for the defendant.
(The US is a world leader in pre-trial detainees. Yay?)

Risk assessment tools have been used for more than 50 years. Actuarial tools have shown greater predictive power than clinical judgment, and they can eliminate some of the discretionary power of judges. The use of these tools has long been controversial: What types of factors should be included in the tool? Is the use of demographic factors to make predictions fair to individuals?

Existing tools use regression analysis. Now machine learning can learn from much more data. Mechanical predictions [= machine learning] are more accurate than statistical predictions, but may not be explicable.

We think humans can explain their decisions, and we want machines to be able to as well. But look at movie reviews. Humans can tell if a review is positive. We can teach a machine which words are positive or negative, getting 60% accuracy. Or we can have a human label the reviews as positive or negative and let the machine figure out what the factors are — via machine learning — in which case we get 80% accuracy but may lose explicability.
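The two approaches can be made concrete. Here is a minimal sketch of the first, hand-taught approach, which scores a review by counting words from fixed positive and negative lists; the word lists and reviews are invented, and a real lexicon would be far larger.

```python
# Minimal hand-built lexicon sentiment classifier, illustrating the
# "teach the machine which words are positive or negative" approach.
# The word lists are invented for illustration; a real lexicon is much larger.

POSITIVE = {"great", "brilliant", "moving", "funny"}
NEGATIVE = {"dull", "tedious", "awful", "boring"}

def classify(review: str) -> str:
    words = review.lower().split()
    # Net count of positive minus negative words; ties default to positive.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score >= 0 else "negative"
```

A learned classifier would instead fit a weight for every word from the labeled reviews, which is where the extra accuracy, and the loss of explicability, comes from.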

With pretrial situations, what is the task that the machine should be performing?

There’s a tension between accuracy and fairness. Computer scientists are trying to quantify these questions: What does a fair algorithm look like? Jon Kleinberg and colleagues did a study of this [this one?]. Their algorithm reduced violent crime by 25% with no change in jailing rates and without increasing racial disparities. In short, the algorithm seems to have done a more accurate job with less bias.
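One way researchers quantify these competing definitions is to compute error rates separately for each group. The following is a minimal sketch, with invented data, of two metrics at the center of the COMPAS debate: precision among those flagged high-risk (the calibration view) and the false positive rate (the error-rate view). When base rates differ between groups, the two generally cannot be equalized at once, which is the source of the tension.

```python
# Sketch of two competing per-group fairness metrics, computed from invented
# (group, predicted_high_risk, actually_recidivated) records. The point is the
# definitions, not the numbers.

def rates(records, group):
    rows = [r for r in records if r[0] == group]
    flagged = [r for r in rows if r[1]]          # predicted high risk
    not_recid = [r for r in rows if not r[2]]    # did not actually recidivate
    precision = sum(r[2] for r in flagged) / len(flagged)   # calibration view
    fpr = sum(r[1] for r in not_recid) / len(not_recid)     # error-rate view
    return precision, fpr
```

Two tools can each look "fair" under one of these metrics while looking biased under the other, which is roughly what happened in the public argument over COMPAS.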

Doaa goes through questions that should be asked of these tools, beginning with: Which factors are considered in each? [She dives into the details for all four tools. I can’t capture it. Sorry.]

What are the sources of data? (3 out of 4 rely on interviews and databases.)

What is the quality of the data? “This is the biggest problem jurisdictions are dealing with when using such a tool.” “Criminal justice data is notoriously poor.” And, of course, if a machine learning system is trained on discriminatory data, its conclusions are likely to reflect those biases.

The tools need to be periodically validated using data from the district’s own population. Local data matters.

There should be separate scores for flight risk and public safety. All but the PSA provide only a single score. This is important because there are separate remedies for the two concerns. E.g., you might want to lock up someone who is a risk to public safety, but merely take away the passport of someone who is a flight risk.
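The point can be illustrated with a toy decision rule in which each score maps to its own remedy. The thresholds and remedy names here are invented; the thing to notice is that a single combined score could not tell these cases apart.

```python
# Toy illustration of why separate flight-risk and public-safety scores matter:
# each risk has its own remedy. Thresholds and remedy names are invented.

def remedies(flight_risk: float, safety_risk: float) -> list:
    out = []
    if safety_risk > 0.7:
        out.append("detain")                # danger to others: detention
    elif flight_risk > 0.7:
        out.append("surrender passport")    # flight risk: travel restriction
    if not out:
        out.append("release")
    return out
```

With only one score, a high-flight-risk/low-danger defendant and a low-flight-risk/dangerous one would receive the same number and the same remedy.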

Finally, the systems should discriminate among reasons for flight risk. E.g., is the defendant missing court because she can’t afford the cost of getting there, or because she’s fleeing?

Conclusion: Pretrial is the front door of the criminal justice system and affects what happens thereafter. Risk assessment tools should not replace judges, but they bring benefits. They should be used, and they should be made as transparent as possible. There are trade-offs. A tool will not eliminate all bias, but it might help reduce it.

Q&A

Q: Do the algorithms recognize the different situations of different defendants?

A: Systems do recognize this, but not in sophisticated ways. That’s why it’s important to understand why a defendant might be at risk of missing a court date. Maybe we could provide poor defendants with a Metro card.

Q: Could machine learning be used to help us be more specific about the types of harm? What legal theories might we draw on to help with this?

A: [The discussion got too detailed for me to follow. Sorry.]

Q: There are different definitions of recidivism. What do we do when there’s a mismatch between the machines and the court?

A: Some states give different weights to different factors based on how long ago the prior crimes were committed. I haven’t seen any difference in considering how far ahead the risk of a possible next crime is.

Q: [me] While I’m very sympathetic to allowing machine learning to be used without always requiring that the output be explicable, when it comes to the justice system, do we need explanations so not only is justice done, but we can have trust that it’s being done?

A: If we can say which factors are going into a decision — and it’s not a lot of them — if the accuracy rate is much higher than manual systems, then maybe we can give up on always being able to explain exactly how it came to its decisions. Remember, pre-trial procedures are short and there’s usually not a lot of explaining going on anyway. It’s unlikely that defendants are going to argue over the factors used.

Q: [me] Yes, but what about the defendant who feels that she’s being treated differently than some other person and wants to know why?

A: Judges generally don’t explain how they came to their decisions anyway. The law sets some general rules, and comparisons between individuals are generally made within the framework of those rules. The rules don’t promise to produce perfectly comparable results. In fact, you probably can’t easily find two people with such similar circumstances. There are no identical cases.

Q: Machine learning, multilevel regression, and human decision making all weigh data and produce an outcome. But ML has little human interaction, statistical analysis has some, and the human decision is all human. Yet all are in fact algorithmic: the judge looks at a bond schedule to set bail. The predictability-as-fairness problem is exacerbated by the human decisions, since the human cannot explain her model.

Q: Did you find any logic about why jurisdictions picked which tool? Any clear process for this?

A: It’s hard to get that information about the procurement process. Usually they use consultants and experts. There’s no study I know of that looks at this.

Q: In NZ, the main tool used for risk assessment for domestic violence is a Canadian tool called ODARA. Do tools work across jurisdictions? How do you reconcile data sets that might be quite different?

A: I’m not against using the same system across jurisdictions — it’s very expensive to develop one from scratch — but they need to be validated. The federal tool has not been, as far as I know. (It was created in 2009.) Some tools do better at this than others.

Q: What advice would you give to a jurisdiction that might want to procure one? What choices did the tools make in terms of what they’re optimized for? Also: What about COMPAS?

A: (I didn’t talk about COMPAS because it’s notorious and not often used in pre-trial, although it started out as a pre-trial tool.) The trade off seems to be between accuracy and fairness. Policy makers should define more strictly where the line should be drawn.

Q: Who builds these products?

A: Three out of the four were built in house.

Q: PSA was developed by a consultant hired by the Arnold Foundation. (She’s from Luminosity.) She has helped develop a number of the tools.

Q: Why did you decide to research this? What’s next?

A: I started here because pre-trial is the beginning of the process. I’m interested in the fairness question, among other things.

Q: To what extent are the 100+ factors that the Colorado tool considers available publicly? Is their rationale for excluding factors public? Because they’re proxies for race? Because they’re hard to get? Or because back then 100+ seemed like too many? And what’s the overlap in factors between the existing systems and the system Kleinberg used?

A: Interviewing defendants takes time, so 100 factors can be too much. Kleinberg only looked at three factors. Another tool relied on six factors.

Q: Should we require private companies to reveal their algorithms?

A: There are various models. One is to create an FDA for algorithms. I’m not sure I support that model. I think private companies need to expose, at least to the government, the factors that they’re including. Others would say I’m too optimistic about the government.

Q: In China we don’t have the pre-trial part, but there’s an article saying that they can make the sentencing more fair by distinguishing among crimes. Also, in China the system is more uniform so the data can be aggregated and the system can be made more accurate.

A: Yes, states are different because they have different laws. Exchanging data between states is not very common and may not even be possible.

[Disclosure: Typical conversations about JP, when he’s not present, attempt — and fail — to articulate his multi-faceted awesomeness. I’ll fail at this also, so I’ll just note that JP is directly responsible for my affiliation with the BKC and for my co-directorship of the Harvard Library Innovation Lab…and those are just the most visible ways in which he has enabled me to flourish as best I can.]

Also, at the end of this post I have some reflections on rules vs. models, and the implicit vs. explicit.

John begins by framing the book as an attempt to find a balance between diversity and free expression. Too often we have pitted the two against each other, especially in the past few years, he says: the left argues for diversity and the right argues for free expression. It’s important to have both, although he acknowledges that there are extremely hard cases where there is no reconciliation; in those cases we need rules and boundaries. But we are much better off when we can find common ground.

“This may sound old-fashioned in the liberal way. And that’s true,” he says. But we’re having this debate in part because young people have been advancing ideas that we should be listening to. We need to be taking a hard look.

Our institutions should be deeply devoted to diversity, equity and inclusion. Our institutions haven’t been as supportive of these as they should be, although they’re getting better at it, e.g. getting better at acknowledging the effects of institutional racism.

The diversity argument pushes us toward the question of “safe spaces.” Safe spaces are crucial in the same way that every human needs a place where everyone around them supports them and loves them, and where you can say dumb things. We all need zones of comfort, with rules implicit or explicit. It might be a room, a group, a virtual space… E.g., survivors of sexual assault need places where they know there are rules and they can express themselves without feeling at risk.

But, John adds, there should also be spaces where people are uncomfortable, where their beliefs are challenged.

Spaces of both sorts are experienced differently by different people. Privileged people like John experience spaces as safe that others experience as uncomfortable.

The examples in his book include trigger warnings, safe spaces, the debates over campus symbols, the disinvitation of speakers, etc. These are very hard to navigate and call out for a series of rules or principles. Different schools might approach these differently. E.g., students from the Gann Academy, a local Jewish high school, are here tonight. They well might experience a space differently than students at Andover. Different schools well might need different rules.

Now John turns it over to students for comments. (This is very typical JP: a modest but brilliant intervention and then a generous deferral to the room. I had the privilege of co-teaching a course with him once, and I can attest that he is a brilliant, inspiring teacher. Sorry to be such a JP fanboy, but I am at least an evidence-based fanboy.) [I have not captured these student responses adequately, in some cases simply because I had trouble hearing them. They were remarkable, however. And I could not get their names with enough confidence to attempt to reproduce them here. Sorry!]

Student Responses

Student: I graduated from Andover and now I’m at Harvard. I was struck by the book’s idea that we need to get over the dichotomy between diversity and free expression. I want to address Chapter 5, about hate speech. It says each institution ought to assess its own values to come up with its principles about speech and diversity, and those principles ought to be communicated clearly and enforced consistently. But, I believe, we should in fact be debating what the baseline should be for all institutions. We don’t all have full options about what school we’re going to go to, so there ought to be a baseline we all can rely on.

JP: Great critique. Moral relativism is not a good idea. But I don’t think one size fits all. In the hardest cases, there might be the sharpest limits. But I do agree there ought to be some sort of baseline around diversity, equity, and inclusion. I’d like to see that be a higher baseline, and we’ve worked on this at Andover. State universities are different. E.g., if a neo-Nazi group wants to demonstrate on a state school campus and they follow the rules laid out in the Skokie case, etc., they should be allowed to demonstrate. If they came to Andover, we’d say no. As a baseline, we might want to change the regulations so that the First Amendment doesn’t apply if the experience is detrimental to the education of the students; that would be a very hard line to draw. Even if we did, we still might want to allow local variations.

Student: Brave spaces are often built from safe spaces. E.g., at Andover we used Facebook to build a safe space for women to talk, in the face of academic competitions where misogyny was too common. This led to creating brave spaces where open, frank discussion across differences was welcomed.

JP: Yes, giving students a sense of safety so they can be brave is an important point. And, yes, brave spaces do often grow from safe spaces.

Andover student: I was struck by why diversity is important: the cross-pollination of ideas. But in my experience, a lot of that hasn’t occurred because we’re stuck in our own groups. There’s also typically a divide between the students and the faculty. Student activists are treated as if they’re just going through a phase. How do we bridge that gap?

JP: How do we encourage more cross-pollination? It’s a really hard problem for educators. I’ve been struck by the difference between teaching at Harvard Law and Andover in terms of the comfort with disagreeing across political divides; it was far more comfortable at the Law School. I’ve told students if you present a paper that disagrees with my point of view and argues for it beautifully, you’ll do better than parroting ideas back to me. Second, we have to stop using demeaning language to talk about student activists. BTW, there is an interesting dynamic, as teachers today may well have been activists when they were young and think of themselves as the reformers.

Student: [hard to hear] At Andover, our classes were seminar-based, which is a luxury not all students have. Also: Wouldn’t encouraging a broader spread of ideas create schisms? How would you create a school identity?

JP: This echoes the first student speaker’s point about establishing a baseline. Not all schools can have 12 students with two teachers in a seminar, as at Andover. We need to find a dialectic. As for schisms: we have to communicate values. Institutions are challenged these days but there is a huge place for them as places that convey values. There needs to be some top down communication of those values. Students can challenge those values, and they should. This gets at the heart of the problem: Do we tolerate the intolerant?

Student: I’m a graduate of Andover and currently at Harvard. My generation has grown up with the Internet. What happens when what is supposed to be a safe space becomes a brave space for some but not all? E.g., a dorm where people speak freely thinking it’s a safe space. What happens when the default values override what someone else views as comfortable? What is the power of an institution to develop, monitor, and mold what people actually feel? When communities engage in groupthink, how can an institution construct safe spaces?

JP: I don’t have an easy answer to this. We do need to remember that these spaces are experienced differently by different people, and the rules ought to reflect this. Some of my best learning came from late night bull sessions. It’s the duty of the institution to do what it can to enable that sort of space. But we also have to recognize that people who have been marginalized react differently. The rule sets need to reflect that fact.

Student: Andover has many different forum spaces available, from hallways to rooms. We get to decide when and where these conversations will occur. For a more traditional public high school where you only have a 30-person classroom as a forum, how do we have the difficult conversations that students at Andover choose to have in more intimate settings?

JP: The size and rule-set of the group matters enormously. Even in a traditional HS you can still break a class into groups. The answer is: How do you hack the space?

Student: I’m a freshman at Harvard. Before the era of safe spaces, we’d call them friends: people we can talk with and have no fear that our private words will be made public, and where we will not be judged. Safe spaces may exclude people, e.g., a safe space open only to women.

JP: Andover has a group for women of color. That excludes people, and for various reasons we think that’s entirely appropriate and useful.

Q&A

Q [Terry Fisher]: You refer frequently to rule sets. If we wanted to have a discussion in a forum like this, you could announce a set of rules. Or the organizer could announce values, such as: we value respect, or we want people to take the best version of what others say. Or, you could not say anything and model it in your behavior. When you and I went to school, there were no rules in classrooms. It was all done by modeling. But this also meant that gender roles were modeled. My experience of you as a wonderful teacher, JP, is that you model values so well. It doesn’t surprise me that so many of your students talk with the precision and respectfulness that you model. I am worried about relying on rule sets, and doubt their efficacy for the long term. Rather, the best hope is people modeling and conveying better values, as in the old method.

JP: Students, Terry Fisher was my teacher. My answer will be incredibly tentative: It is essential for an institution to convey its values. We do this at Andover. Our values tell us, for example, that we don’t want gender-based imbalance and are aware that we are in a misogynist culture, and thus need reasonable rules. But, yes, modeling is the most powerful.

Q [Dorothy Zinberg]: I’ve been at Harvard for about 70 yrs and I have seen the importance of an individual in changing an institution. For example, McGeorge Bundy thought he should bring 12 faculty to Harvard from non-traditional backgrounds, including Erik Erikson who did not have a college degree. He had been a disciple of Freud’s. He taught a course at Harvard called “The Lifecycle.” Every Harvard senior was reading The Catcher in the Rye. Erikson was giving brilliant lectures, but I told him it was from his point of view as a man, and had nothing to do with the young women. So, he told me, a grad student, to write the lectures. No traditional professor would have done that. Also: for forming groups, there’s nothing like closing the door. People need to be able to let go and try a lot of ideas.

Q: I am from the Sudan. How do you create a safe space in environments that are exclusive? [I may have gotten that wrong. Sorry.] How do you acknowledge the Native American tribes whose land this institution is built on, or the slaves who did the building?

JP: We all have that obligation. [JP gives some examples of the Law School recently acknowledging the slave labor, and the money from slave holders, that helped build the school.]

Q: You used a kitchen as an example of a safe space. Great example. But kitchens are not established or protected by any authority. It’s a new idea that institutions ought to set these up. Do you think there should be safe spaces that are privately set up as well as by institutions? Should some be permitted to exclude people or not?

(JP asks a student to respond): Institutional support can be very helpful when you have a diversity of students. Can institutional safe spaces supplement private ones? I’m not sure. And I do think exclusive groups have a place. As a consensus forms, it’s important to allow the marginalized voices to connect.

Q [head of Gann]: I’m a grad of Phillips Academy. As head of a religious school, we’re struggling with all these questions. Navigating these spaces isn’t just a political or intellectual activity. It is a work of the heart. If the institution thinks of this only as a rational activity and doesn’t tend to the hearts of our students, and is not explicit about the habits of heart we need to navigate these sensitive waters, only those with natural emotional skills will be able to flourish. We need to develop leaders who can turn hard conversations into generative ones. What would it look like to take on the work of developing social and emotional development?

JP: I’ve been to Gann and am confident that’s what you’re doing. And you can see evidence of Andover’s work on it in the students who spoke tonight. Someone asked me if a student became a Nazi, would you expel him? Yes, if it were apparent in his actions, but probably not for his thoughts. Ideally, our students won’t come to have those views because of the social and emotional skills they’re learning. But people in our culture do have those views. Your question brings it back to the project of education and of democracy.

[This session was so JP!]

A couple of reactions to this discussion without having yet read the book.

First, about Prof. Fisher’s comment: I think we are all likely to agree that modeling the behavior we want is the most powerful educational tool. JP and Prof. Fisher are both superb, well, models of this.

But, as Prof. Fisher noted in his question, the dominant model of discourse for our generation silently (and sometimes explicitly) favored males, white middle class values, etc. Explicit rules weren’t as necessary because we had internalized them and had stacked the deck against those who were marginalized by them. Now that diversity has thankfully become an explicit goal, and now that the Internet has thrown us into conversations across differences, we almost always need to make those rules explicit; a conversation among people from across divides of culture, economics, power, etc. that does not explicitly acknowledge the different norms under which the participants operate is almost certainly going to either fragment or end in misunderstanding.

(Clay Shirky and I had a collegial difference of opinion about this about fifteen years ago. Clay argued for online social groups having explicit constitutions. I argued for the importance of the “unspoken” in groups, and the damage that making norms explicit can cause.)

Second, about the need for setting a baseline: I’m curious to see what JP’s book says about this, because the evidence is that we as a culture cannot agree about what the baseline is: vociferous and often nasty arguments about this have been going on for decades. For example, what’s the baseline for inviting (or disinviting) people with highly noxious views to a private college campus? I don’t see a practical way forward for establishing a baseline answer. We can’t even get Texas schools to stop teaching Creationism.

So, having said that modeling is not enough, and having despaired at establishing a baseline, I think I am left being unhelpfully dialectical:

1. Modeling is essential but not enough.

2. We ought to be appropriately explicit about rules in order to create places where people feel safe enough to be frank and honest…

3. …But we are not going to be able to agree on a meaningful baseline for the U.S., much less internationally — “meaningful” meaning that it is specific enough that it can be applied to difficult cases.

4. But modeling may be the only way we can get to enough agreement that we can set a baseline. We can’t do it by rules because we don’t have enough unspoken agreement about what those rules should be. We can only get to that agreement by seeing our leading voices in every field engage across differences in respectful and emotionally truthful ways. So at the largest level, I find I do agree with Prof. Fisher: we need models.

5. But if our national models are to reflect the values we want as a baseline, we need to be thoughtful, reflective, and explicit about which leading voices we want to elevate as models. We tend to do this not by looking for rules but by looking for Prof. Fisher’s second alternative: values. For example, we say positively that we love John McCain’s being a “maverick” or Kamala Harris’ careful noting of the evidence for her claims, and we disdain Trump’s name-calling. Rules derive from values such as those. Values come before rules.

I just wish I had more hope about the direction we’re going in…although I do see hopeful signs in some of the model voices who are emerging, and most of all, in the younger generation’s embrace of difference.