Archive for the ‘Programming’ Category

A century of research shows that traditional grammar lessons—those hours spent diagramming sentences and memorizing parts of speech—don’t help and may even hinder students’ efforts to become better writers. Yes, they need to learn grammar, but the old-fashioned way does not work.

This finding—confirmed in 1984, 2007, and 2012 through reviews of over 250 studies—is consistent among students of all ages, from elementary school through college. For example, one well-regarded study followed three groups of students from 9th to 11th grade where one group had traditional rule-bound lessons, a second received an alternative approach to grammar instruction, and a third received no grammar lessons at all, just more literature and creative writing. The result: No significant differences among the three groups—except that both grammar groups emerged with a strong antipathy to English.

There is a real cost to ignoring such findings. In my work with adults who dropped out of school before earning a college degree, I have found over and over again that they over-edit themselves from the moment they sit down to write. They report thoughts like “Is this right? Is that right?” and “Oh my god, if I write a contraction, I’m going to flunk.” Focused on being correct, they never give themselves a chance to explore their ideas or ways of expressing those ideas. Significantly, this sometimes-debilitating focus on “the rules” can be found in students who attended elite private institutions as well as those from resource-strapped public schools.
…
(Three out of five links here are pay-per-view. Sorry.)

It’s only a century of research. Don’t want to rush into anything. 😉

How would you adapt this finding to teaching programming and/or hacking?

There are a ton of amazing debugging tools for Linux that I love. strace! perf! tcpdump! wireshark! opensnoop! I think a lot of them aren’t as well-known as they should be, so I’m making a friendly zine explaining them.

Donate, subscribe (PDF or paper)!

If you follow Julia’s blog (http://jvns.ca) or twitter (@b0rk), you know what a treat the zine will be!

If you don’t (correct that error now) and consider the following sample:

It’s possible there are better explanations than Julia’s, so if and when you see one, sing out!

Felienne is always learning. In exploring her PhD dissertation and her public speaking experience it’s clear that she has no intent on stopping! Most recently she’s been exploring a large corpus of Scratch programs looking for Code Smells. How do children learn how to code, and when they do, does their code “smell?” Is there something we can do when teaching to promote cleaner, more maintainable code?

Felienne discusses a paper due to appear in September on analysis of 250K Scratch programs for code smells.

To tap into the power of Python’s open data science stack—including NumPy, Pandas, Matplotlib, Scikit-learn, and other tools—you first need to understand the syntax, semantics, and patterns of the Python language. This report provides a brief yet comprehensive introduction to Python for engineers, researchers, and data scientists who are already familiar with another programming language.

Author Jake VanderPlas, an interdisciplinary research director at the University of Washington, explains Python’s essential syntax and semantics, built-in data types and structures, function definitions, control flow statements, and more, using Python 3 syntax.

You’ll explore:

Python syntax basics and running Python code

Basic semantics of Python variables, objects, and operators

Built-in simple types and data structures

Control flow statements for executing code blocks conditionally

Methods for creating and using reusable functions

Iterators, list comprehensions, and generators

String manipulation and regular expressions

Python’s standard library and third-party modules

Python’s core data science tools

Recommended resources to help you learn more

Jake VanderPlas is a long-time user and developer of the Python scientific stack. He currently works as an interdisciplinary research director at the University of Washington, conducts his own astronomy research, and spends time advising and consulting with local scientists from a wide range of fields.

A few times a year, I have the job of teaching a bunch of people who have never written code before how to program from scratch. The nature of programming being what it is, the same error crop up every time in a very predictable pattern. I usually encourage my students to go through a step-by-step troubleshooting process when trying to fix misbehaving code, in which we go through these common errors one by one and see if they could be causing the problem. Today, I decided to finally write this troubleshooting process down and turn it into a flowchart in non-threatening colours.

Behold, the “my code isn’t working” step-by-step troubleshooting guide! Follow the arrows to find the likely cause of your problem – if the first thing you reach doesn’t work, then back up and try again.

StackOverflow is an essential resource for programmers. Whether you run into a bizarre and scary error message or you’re blanking on something you should know, StackOverflow comes to the rescue. Its popularity with coders spurred many jokes and memes. (Programming to be Officially Renamed “Googling Stackoverflow,” a satirical headline reads).

(image omitted)

While all of us are users of StackOverflow, contributing to this knowledge base can be very intimidating, especially to beginners or to non-traditional coders who many already feel like they don’t belong. The fact that an invisible barrier exists is a bummer because being an active contributor not only can help with your job search and raise your profile, but also make you a better programmer. Explaining technical concepts in an accessible way is difficult. It is also well-established that teaching something solidifies your knowledge of the subject. Answering StackOverflow questions is great practice.

All of the benefits of being an active member of StackOverflow were apparent to me for a while, but I registered an account only this week. Let me walk you t[h]rough thoughts that hindered me. (Chances are, you’ve had them too!)
…

I plead guilty to using StackOverFlow but not contributing back to it.

Another “intimidation” to avoid is thinking you must have the complete and killer answer to any question.

That can and does happen, but don’t wait for a question where you can supply such an answer.

Today I wanted to strace a JVM process to see if it was making network calls, and I discovered a minor roadblock: It was a Clojure program being run using the Leiningen build tool. lein run spawns a JVM subprocess and then exits, and I only wanted to trace that subprocess.

The solution is simple, but worth a post: Tell lein to run a different “java” command that actually wraps a call to java with strace. Here’s how I did it:
…

For the “…you never do know file…” and because it’s better to know than to assume.

When discussing functional programming we often talk about the machinery, and not the core principles. Functional programming is not about monads, monoids, or zippers, even though those are useful to know. It is primarily about writing programs by composing generic reusable functions. This article is about applying functional thinking when refactoring TypeScript code.

Fed up with a ton of tutorials but no easy way to find exercises I decided to create a repo just with exercises to practice pandas. Don’t get me wrong, tutorials are great resources, but to learn is to do. So unless you practice you won’t learn.

There will be three different types of files:

Exercise instructions

Solutions without code

Solutions with code and comments

My suggestion is that you learn a topic in a tutorial or video and then do exercises. Learn one more topic and do exercises. If you got the answer wrong, don’t go to the solution with code, follow this advice instead.

Suggestions and collaborations are more than welcome. 🙂

I’m sure you will find this useful but when I search for pandas exercise python, I get 298,000 “hits.”

As anyone who has tried Pokémon Go recently is probably aware, Pokémon come in different types. A Pokémon’s type affects where and when it appears, and the types of attacks it is vulnerable to. Some types, like Normal, Water and Grass are common; others, like Fairy and Dragon are rare. Many Pokémon have two or more types.

To get a sense of the distribution of Pokémon types, Joshua Kunst used R to download data from the Pokémon API and created a treemap of all the Pokémon types (and for those with more than 1 type, the secondary type). Johnathon’s original used the 800+ Pokémon from the modern universe, but I used his R code to recreate the map for the 151 original Pokémon used in Pokémon Go.
…

You will get some much needed rest, polish up your R skills and perhaps learn something about the Pokémon API.

The Pokémon Go craze brings to mind the potential for the creation of alternative location-based games. Accessing locations which require steady nerves and social engineering skills. That definitely has potential.

Say a spy-vs-spy character at a location near a “secret” military base? 😉

Posted in Programming, R | Comments Off on An analysis of Pokémon Go types, created with R

The speaker will discuss what he considers to be the most important outcome of his work developing TeX in the 1980s, namely the accidental discovery of a new approach to programming — which caused a radical change in his own coding style. Ever since then, he has aimed to write programs for human beings (not computers) to read. The result is that the programs have fewer mistakes, they are easier to modify and maintain, and they can indeed be understood by human beings. This facilitates reproducible research, among other things.

But each resource stands alone as its own silo. It can (and many do) refer to other materials, even with hyperlinks, but if you want to explore any of them, you must explore them separately. That’s what being in a silo means. You have to start over at the beginning. Every time.

That is complicated by the existence of thousands of slideshows and videos on programming topics not listed here. Search for your favorite programming language at Slideshare and Youtube. There are other repositories of slideshows and videos, those are just examples.

Each one of those slideshows and/or videos is also a silo. Not to mention that with video you need a time marker if you aren’t going to watch every second of it to find relevant material.

What if you could traverse each of those silos, books, posts, slideshows, videos, documentation, source code, seamlessly?

Making that possible for C/C++ now, given the backlog of material, would have a large upfront cost before it could be useful.

Making that possible for languages with shorter histories, well, how useful would it need to be to justify its cost?

And how would you make it possible for others to easily contribute gems that they find?

Something to think about as you wander about in each of these separate silos.

Mozilla has started publishing nightly in-development builds of its experimental Servo browser engine so anyone can track the project’s progress.

Executables for macOS and GNU/Linux are available right here to download and test drive even if you’re not a developer. If you are, the open-source engine’s code is here if you want to build it from scratch, fix bugs, or contribute to the effort.

Right now, the software is very much in a work-in-progress state, with a very simple user interface built out of HTML. It’s more of a technology demonstration than a viable web browser, although Mozilla has pitched Servo as a potential successor to Firefox’s Gecko engine.

Crucially, Servo is written using Rust – Mozilla’s more-secure C-like systems programming language. If Google has the language of Go, Moz has the language of No: Rust. It works hard to stop coders making common mistakes that lead to exploitable security bugs, and we literally mean stop: the compiler won’t build the application if it thinks dangerous code is present.

Rust focuses on safety and speed: its security measures do not impact it at run-time as the safety mechanisms are in the language by design. For example, variables in Rust have an owner and a lifetime; they can be borrowed by another owner. When a variable is being used by one owner, it cannot be used by another. This is supposed to help enforce memory safety and stop data races between threads.

It also forces the programmer to stop and think about their software’s design – Rust is not something for novices to pick up and quickly bash out code on.

…

Even though pre-release and rough, I was fairly excited until I read:

…
One little problem is that Servo relies on Mozilla’s SpiderMonkey JavaScript engine, which is written in C/C++. So while the HTML-rendering engine will run secured Rust code, fingers crossed nothing terrible happens within the JS engine.
…

Many newcomers to R got their start learning the language with Computerworld’s Beginner’s Guide to R, a 6-part introduction to the basics of the language. Now, budding R users who want to take their skills to the next level have a new guide to help them: Computerword’s Advanced Beginner’s Guide to R. Written by Sharon Machlis, author of the prior Beginner’s guide and regular reporter of R news at Computerworld, this new 72-page guide dives into some trickier topics related to R: extracting data via API, data wrangling, and data visualization.
…

Mobilize centers its curricula around participatory sensing campaigns in which students use their mobile devices to collect and share data about their communities and their lives, and to analyze these data to gain a greater understanding about their world.￼Mobilize breaks barriers by teaching students to apply concepts and practices from computer science and statistics in order to learn science and mathematics. Mobilize is dynamic: each class collects its own data, and each class has the opportunity to make unique discoveries. We use mobile devices not as gimmicks to capture students’ attention, but as legitimate tools that bring scientific enquiry into our everyday lives.

Mobilize comprises four key curricula: Introduction to Data Science (IDS), Algebra I, Biology, and Mobilize Prime, all focused on preparing students to live in a data-driven world. The Mobilize curricula are a unique blend of computational and statistical thinking subject matter content that teaches students to think critically about and with data. The Mobilize curricula utilize innovative mobile technology to enhance math and science classroom learning. Mobilize brings “Big Data” into the classroom in the form of participatory sensing, a hands-on method in which students use mobile devices to collect data about their lives and community, then use Mobilize Visualization tools to analyze and interpret the data.
…

I like the approach of having the student collect their own and process their own data. If they learn to question their own data and processes, hopefully they will ask questions about data processing results presented as “facts.” (Since 2016 is a presidential election year in the United States, questioning claimed data results is especially important.)

Hi! The Clojure Gazette has recently changed from a list of curated links to an essay-style newsletter. I’ve gotten nothing but good comments about the change, but I’ve also noticed the first negative growth of readership since I started. I know these essays aren’t for everyone, but I’m sure there are people out there who would like the new format who don’t know about it. Would you do me a favor? Please share the Gazette with your friends!

The Biggest Waste in Our Industry is the title of the essay I link to above.

From the post:

I would like to talk about two nasty habits I have been party to working in software. Those two habits are 1) protecting programmer time and 2) measuring programmer productivity. I’m talking from my experience as a programmer to all the managers out there, or any programmer interested in process.
…

Peopleware was first published in 1987, second edition in 1999 (8 new chapters), third edition in 2013 (5 more pages than 1999 edition?).

Twenty-nine (29) years after the publication of Peopleware, managers still don’t “get” how to manage programmers (or other creative workers).

Disappointing, but not surprising.

It’s not uncommon to read position ads that describe going to lunch en masse, group activities, etc.

You would think they were hiring lemmings rather than technical staff.

If your startup founder is that lonely, check the local mission. Hire people for social activities, lunch, etc. Cheaper than hiring salaried staff. Greater variety as well. Ditto for managers with the need to “manage” someone.

Posted in Clojure, Programming | Comments Off on Clojure Gazette – New Format – Looking for New Readers

The objective of this thesis is to evaluate the state of the art in formal methods usage in secure computing. From this evaluation, we analyze the common components and search for weaknesses within the common workflows of secure software construction. An improved workflow is proposed and appropriate system requirements are discussed. The systems are evaluated and further tools in the form of libraries of functions, data types and proofs are provided to simplify work in the selected system. Future directions include improved program and proof guidance via compiler error messages, and targeted proof steps.

The criteria for selecting a language for this work were expressive power, theorem proving ability (sufficient to perform universal quantification), extraction/compilation, and performance. Idris has sufficient expressive power to be used as a general purpose language (by design) and has library support for many common tasks (including web development). It supports machine verified proof and universal quantification over its datatypes and can be directly compiled to produce efficiently sized executables with reasonable performance (see section 10.1 for details). Because of these characteristics, we have chosen Idris as the basis for our further work. (at page 57)

Welcome to ClojureBridge CommunityDocs, the central location for material supporting and extending the core ClojureBridge curriculum (https://github.com/ClojureBridge/curriculum). Our goal is to provide additional labs and explanations from coaches to address the needs of attendees from a wide range of cultural and technical backgrounds.
…

Developers around the world are using these definitions in their work, and modern API tooling and service providers are using them to define the value they bring to the table. To help the API sector reach the next level, we need you to step up and share the API definitions you have with API Stack, APIs.io, or APIs.guru, and if you have the time and skills, we could use your help crafting other new API definitions for popular services available today.

The APIs.guru content is curated primarily by its creator, Ivan Goncharov. According to a DataFire Blog entry, the initial content was populated "using a combination of automated scraping and human curation to crawl the web for machine-readable API definitions."

Software Carpentry is having a Bug BBQ on June 13th

Software Carpentry is aiming to ship a new version (5.4) of the Software Carpentry lessons by the end of June. To help get us over the finish line we are having a Bug BBQ on June 13th to squash as many bugs as we can before we publish the lessons. The June 13th Bug BBQ is also an opportunity for you to engage with our world-wide community. For more info about the event, read-on and visit our Bug BBQ website.

How can you participate? We’re asking you, members of the Software Carpentry community, to spend a few hours on June 13th to wrap up outstanding tasks to improve the lessons. Ahead of the event, the lesson maintainers will be creating milestones to identify all the issues and pull requests that need to be resolved we wrap up version 5.4. In addition to specific fixes laid out in the milestones, we also need help to proofread and bugtest the
lessons.

Where will this be? Join in from where you are: No need to go anywhere – if you’d like to participate remotely, start by having a look at the milestones on the website to see what tasks are still open, and send a pull request with your ideas to the corresponding repo. If you’d like to get together with other people working on these lessons live, we have created this map for live sites that are being organized. And if there’s no site listed near you, organize one yourself and let us know you are doing that here so that we can add your site to the map!

The Bug BBQ is going to be a great chance to get the community together, get our latest lessons over the finish line, and wrap up a product that gives you and all our contributors credit for your hard work with a citable object – we will be minting a DOI for this on publication.

A community BBQ that is open to everyone, dietary restrictions or not!

And the organizers have removed distance as a consideration for “attending.”

For those of us on non-BBQ diets, a unique opportunity to participate with others in the community for a worthy cause.

A design document is where, before starting to implement a system, you write up a thing explaining what the system is supposed to do first and how you’re planning to accomplish that. I think there are basically two goals:

tell people what you’re doing

figure out design problems with the system before you’ve been coding for 2 months

I understand that it’s super important to think ahead a lot before huge projects, but a little bit of thinking can be helpful even for smaller projects. I asked some people recently if they write design docs for small projects and some of them said “yeah totally! small ones! it helps! :D”.

I used to get kind of grumpy when someone was like “hey julia can you write a design document for your system?” It would seem like a reasonable idea, though, so I’d try to do it! But the first couple of times I tried to write one I felt like it didn’t actually really help me! I liked the idea in principle, but I didn’t really know how to apply it and I felt like it was hard to get good feedback.

Last week I wrote a design doc and I thought it was sort of helpful. Here are some current thoughts.
…

Be forewarned that Julia is a gifted writer and you will enjoy her posts more than your design documents. 😉

Still, Julia makes a great case for the use of design documents (a/k/a “documentation”).

This ProPublica story by Julia Angwin, Jeff Larson, Surya Mattu and Lauren Kirchner, isn’t short but it is worth your time to not only read, but to download the data and test their analysis for yourself.

Especially if you have the mis-impression that algorithms can avoid bias. Or that clients will apply your analysis with the caution that it deserves.

Finding a bias in software, like finding a bug, is a good thing. But that’s just one, there is no estimate of how many others may exist.

And as you will find, clients may not remember your careful explanation of the limits to your work. Or apply it in ways you don’t anticipate.

Machine Bias – There’s software used across the country to predict future criminals. And it’s biased against blacks.

Here’s the first story to try to lure you deeper into this study:

ON A SPRING AFTERNOON IN 2014, Brisha Borden was running late to pick up her god-sister from school when she spotted an unlocked kid’s blue Huffy bicycle and a silver Razor scooter. Borden and a friend grabbed the bike and scooter and tried to ride them down the street in the Fort Lauderdale suburb of Coral Springs.

Just as the 18-year-old girls were realizing they were too big for the tiny conveyances — which belonged to a 6-year-old boy — a woman came running after them saying, “That’s my kid’s stuff.” Borden and her friend immediately dropped the bike and scooter and walked away.

But it was too late — a neighbor who witnessed the heist had already called the police. Borden and her friend were arrested and charged with burglary and petty theft for the items, which were valued at a total of $80.

Compare their crime with a similar one: The previous summer, 41-year-old Vernon Prater was picked up for shoplifting $86.35 worth of tools from a nearby Home Depot store.

Prater was the more seasoned criminal. He had already been convicted of armed robbery and attempted armed robbery, for which he served five years in prison, in addition to another armed robbery charge. Borden had a record, too, but it was for misdemeanors committed when she was a juvenile.

Yet something odd happened when Borden and Prater were booked into jail: A computer program spat out a score predicting the likelihood of each committing a future crime. Borden — who is black — was rated a high risk. Prater — who is white — was rated a low risk.

Two years later, we know the computer algorithm got it exactly backward. Borden has not been charged with any new crimes. Prater is serving an eight-year prison term for subsequently breaking into a warehouse and stealing thousands of dollars’ worth of electronics.
…

This analysis demonstrates that malice isn’t required for bias to damage lives. Whether the biases are in software, in its application, in the interpretation of its results, the end result is the same, damaged lives.

I don’t think bias in software is avoidable but here, here no one was even looking.

What role do you think budget justification/profit making played in that blindness to bias?

This class provides an introduction to computer programming for law students. The programming language taught may vary from year-to-year, but it will likely be a language designed to be both easy to learn and powerful, such as Python or JavaScript. There are no prerequisites, and even students without training in computer science or engineering should be able successfully to complete the class.

The course is based on the premise that computer programming has become a vital skill for non-technical professionals generally and for future lawyers and policymakers specifically. Lawyers, irrespective of specialty or type of practice, organize, evaluate, and manipulate large sets of text-based data (e.g. cases, statutes, regulations, contracts, etc.) Increasingly, lawyers are asked to deal with quantitative data and complex databases. Very simple programming techniques can expedite and simplify these tasks, yet these programming techniques tend to be poorly understood in legal practice and nearly absent in legal education. In this class, students will gain proficiency in various programming-related skills.

A secondary goal for the class is to introduce students to computer programming and computer scientific concepts they might encounter in the substantive practice of law. Students might discuss, for example, how programming concepts illuminate and influence current debates in privacy, intellectual property, consumer protection, antidiscrimination, antitrust, and criminal procedure.
…

The language for this year is Python. The course website, http://paulohm.com/classes/cpl16/ does not have any problem sets posted, yet. Be sure to check back for those.

Recommend this to any and all lawyers you encounter. It isn’t possible to predict who will or will not be a judge someday. Judges with a basic understanding of computing could improve the overall quality of decisions on computer technology.

An efficient computer set-up is analogous to a well-tuned vehicle: its components work in harmony, it is well-serviced, and it is fast. This chapter describes the software decisions that will enable a productive workflow. Starting with the basics and moving to progressively more advanced topics, we explore how the operating system, R version, startup files and IDE can make your R work faster (though IDE could be seen as basic need for efficient programming). Ensuring correct configuration of these elements will have knock-on benefits in many aspects of your R workflow. That’s why we cover them at this early stage (hardware, the other fundamental consideration, is covered in the next chapter). By the end of this chapter you should understand how to set-up your computer and R installation (skip to section 2.3 if R is not already installed on your computer) for optimal computational and programmer efficiency. It covers the following topics:

R and the operating systems: system monitoring on Linux, Mac and Windows

For lazy readers, and to provide a taster of what’s to come, we begin with our ‘top 5’ tips for an efficient R set-up. It is important to understand that efficient programming is not simply the result of following a recipe of tips: understanding is vital for knowing when to use a memorised solution to a problem and when to go back to first principles. Thinking about and understanding R in depth, e.g. by reading this chapter carefully, will make efficiency second nature in your R workflow.

…

Nope, go see Chapter 2 if you want the top 5 tips for efficient R set-up.

The text and code are being developed at the website and the authors welcome “pull requests and general comments.”