Jeremy Howard
https://jphoward.wordpress.com
I really should work out what this is for
Tue, 03 Mar 2015 16:31:50 +0000
Create a random sample using PowerShell
https://jphoward.wordpress.com/2013/05/13/creare-a-random-sample-using-powershell/
Sun, 12 May 2013 23:20:25 +0000

Very often you will need a random sample of a file. This is really handy for quickly prototyping a script before you run it on a really large file. Or, if you are just doing some statistical analysis, it is very likely that you won’t even need to run it on the full file at all. Therefore, I generally create 10% and 1% samples of any large files that I am working with. When using Windows I find this easiest to do using PowerShell. Here is the command that I use (replace the ’10’ with ‘100’ to get a 1% sample):
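The same idea, keeping each line with a probability of 1 in n, can be sketched in JavaScript as follows. This is an illustration only and not the PowerShell command referred to above; the function name `sampleLines` and the injectable `rng` parameter are assumptions made for this example.

```javascript
// Keep each line with probability 1/n (n = 10 gives roughly a 10% sample,
// n = 100 roughly a 1% sample).
// `rng` is a hypothetical injectable random source so the behaviour can be
// checked deterministically; it defaults to Math.random.
function sampleLines(lines, n, rng = Math.random) {
  return lines.filter(() => Math.floor(rng() * n) === 0);
}

// Example: sample a small in-memory "file".
const lines = Array.from({ length: 1000 }, (_, i) => `row ${i}`);
const tenPercent = sampleLines(lines, 10);
console.log(`kept ${tenPercent.length} of ${lines.length} lines`);
```

Because each line is kept independently, the sample size is only approximately 10%; that is usually fine for prototyping.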

Intermission–REST API in Python with Flask-Restless
https://jphoward.wordpress.com/2013/01/09/intermissionrest-api-in-python-with-flask-restless/
Wed, 09 Jan 2013 00:34:59 +0000

In my End to end web app in under an hour tutorial I have been using C# for the backend and SQL Server for the DB. What if you’d rather use something else? Easy! For example, here’s how to port what we’ve done in parts 1-4 to Python. We’ll use the handy Flask-Restless library to create the API, along with Flask-SQLAlchemy to handle ORM duties for us. We’ll use SQLite as our DB in development, since it’s easy to get up and running. You should switch to something more appropriate in production (such as PostgreSQL), although I won’t be covering that here.

That’s all you need! Try running that, and go to /api/todo_item in your browser. (Note that flask-restless turns CamelCase class names into underscore_separated names). You may also want to prepopulate your table with some data. I’ll leave that for you to do before moving on to the next section.

Updating the controller

Flask-Restless uses a somewhat different format for both its request and its response. Therefore we have to modify a few things in our controller (templates and directives however should not need to change). I created the following method in my ListCtrl, in order to simplify creating a request in the format that Flask-Restless expects:
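A helper along those lines might look roughly like the sketch below. It assumes the `q` search parameter (a JSON object with `filters` and `order_by`) and the `page` parameter used by the Flask-Restless releases of that period; the function name `makeParams` and the `todo` field name are hypothetical and chosen only for illustration.

```javascript
// Build query-string parameters in the shape Flask-Restless is assumed to
// expect: a JSON-encoded `q` object plus a `page` number.
// `makeParams` and the `todo` field name are hypothetical.
function makeParams(query, sortOrder, sortDesc, page) {
  const q = {
    order_by: [{ field: sortOrder, direction: sortDesc ? "desc" : "asc" }],
  };
  if (query) {
    // Substring match on the (assumed) `todo` column.
    q.filters = [{ name: "todo", op: "like", val: "%" + query + "%" }];
  }
  return { q: JSON.stringify(q), page: page };
}

console.log(makeParams("milk", "priority", false, 1));
```

The returned object can be passed as the parameters of a `$resource` query, so the rest of the controller does not need to know about the different request format.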

Finally, I created a small method to call when creating the controller, or when changing sort order:

$scope.reset = function() {
$scope.page = 1;
$scope.search();
};

That’s basically all that is required to have this up-and-running in Python!

End to end web app in under an hour–Part 4
https://jphoward.wordpress.com/2013/01/05/end-to-end-web-app-in-under-an-hourpart-4/
Fri, 04 Jan 2013 23:04:37 +0000

Here is the video tutorial that goes with this post:

Adding Indexes

Since we will probably sort frequently by the displayed columns, we should add indexes to them. However, our Todo column is currently unlimited in length, which will be tricky to index. So let’s first add an appropriate constraint in the TodoItem class:

[MaxLength(800)]
public String Todo { get; set; }

(NB: You’ll need “using System.ComponentModel.DataAnnotations” for this attribute.) Now run ‘Add-Migration CreateTodoIndexes’ in the console (where “CreateTodoIndexes” is the name of the migration – use whatever name you prefer), and customize the Up() method to add the indexes.

You need to paste in the unique connection string provided by AppHarbor. You can get this by clicking ‘Configuration Variables’ by the list of AppHarbor addons, and clicking the copy button for the value of ‘SQLSERVER_CONNECTION_STRING’.

Finally, we need to ensure that Update-Database is called automatically as required, by adding this to the database context class (AmazingTodoContext):

If you now commit to GitHub and sync, AppHarbor will be notified of the commit and will automatically build and deploy your application. Each time you commit, a new build and deploy will be kicked off, and the status is displayed in AppHarbor.

The Hostnames link shows you the hostname of your running app.

Click it (and add ‘index.html’ to the url) to see your stunning addition to internet commerce!

You can get the code as at the end of part 4 from this GitHub link. Don’t forget to set the connection string and also to update your local database if you want to run the code from GitHub, rather than creating it yourself using the tutorial.

In the next part we will make the application more resilient, by adding error handling and validation.

End to end web app in under an hour–Part 3
https://jphoward.wordpress.com/2013/01/05/end-to-end-web-app-in-under-an-hourpart-3/
Fri, 04 Jan 2013 18:50:48 +0000

Here is the video tutorial that goes with this post:

The ‘control-group’ div needs to be repeated for each item. This could be made easier by writing a directive, but you know how to do that yourself now… The ng-model is showing something new here – the ability to create properties of an object on-the-fly. In this case, an object called $scope.item is being created.

The Cancel button simply redirects to the ‘/’ template. Note that all AngularJS internal links must start with ‘#’, in order to stay within the same server page. We need to add a route to app.js to allow us to go to this page:

when('/new', { controller: CreateCtrl, templateUrl: 'detail.html' }).

Create an empty controller called CreateCtrl for now, and add a TH to list.html to allow us to jump to the create page.

<th><a href="#/new"><i class="icon-plus-sign"></i></a></th>

Check the new form displays correctly.

To make this actually do something useful, add ‘ng-click="save()"’ to the Submit button, and add your save method to CreateCtrl.

Hopefully the general approach to this method is familiar from our earlier use of query() – we pass a second parameter, which is a callback called on success. We really should add a third parameter: a callback called on failure. I’ll let you tackle that yourself. The good news is that our Create form is now fully working! In the future we will add validation and also use a date-picker to make it easier to pick a due date.
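As a rough sketch of what such a save method could look like with the failure callback included: the controller below assumes a `Todo` object created with `$resource`, and the `$scope.error` property and its message are hypothetical additions for illustration, not part of the tutorial code.

```javascript
// Sketch of a CreateCtrl whose save() passes both a success and a failure
// callback. `Todo` is assumed to be a $resource-style object;
// `$scope.error` is a hypothetical property a template could display.
function CreateCtrl($scope, $location, Todo) {
  $scope.save = function () {
    Todo.save(
      $scope.item,
      function () {
        // Success: go back to the list.
        $location.path("/");
      },
      function (response) {
        // Failure: record a message instead of redirecting.
        $scope.error = "Could not save item (status " + response.status + ")";
      }
    );
  };
}
```

Keeping the failure callback next to the success callback makes it harder to forget the error path when the form later gains validation.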

Edit Form

We can use the same template for the edit form. But we’ll need a new route and a new controller that captures and displays the item to edit. The route:

Note that we’ve added $routeParams to our parameters in order to grab the id we captured from the url (i.e. where ‘:itemId’ appears in the route). Finally, let’s add an edit link as the final TD to each row.
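A controller of this kind might be sketched as follows. It assumes the `Todo` resource exposes `get` and a custom `update` action (for example one mapped to HTTP PUT), neither of which is confirmed by the text above, so treat those names as assumptions.

```javascript
// Sketch of an edit controller that reads the id captured by ':itemId'.
// Assumes `Todo.get` and a custom `Todo.update` action exist on the resource.
function EditCtrl($scope, $routeParams, $location, Todo) {
  // Load the item to edit; the form binds to $scope.item.
  $scope.item = Todo.get({ id: $routeParams.itemId });

  $scope.save = function () {
    Todo.update({ id: $routeParams.itemId }, $scope.item, function () {
      // Return to the list once the update succeeds.
      $location.path("/");
    });
  };
}
```

Because the template only talks to `$scope.item` and `$scope.save()`, the same detail template can serve both the create and edit controllers.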

I’ve used jQuery’s fadeOut() method here so that the user gets some feedback about the successful deletion. Strictly speaking, code that manipulates the DOM should really be in a directive, not a controller – but that seems like overkill for this single line.

It works!

So now we have a complete CRUD application. In the next part, we’ll learn how to add indexes using EF Migrations, and we’ll also learn how to make our app available to the public using AppHarbor. I hope you are ready for the fame and fortune that you will receive once your peers can use your amazing todo application!

End to end web app in under an hour–Part 2
https://jphoward.wordpress.com/2013/01/04/end-to-end-web-app-in-under-an-hourpart-2/
Thu, 03 Jan 2013 23:23:33 +0000

Here is the video tutorial that goes with this post:

“ng-model” is perhaps the most important and useful AngularJS directive: it creates a 2-way binding between a property in $scope and the value of an HTML element. In this case, our text box’s value is bound to $scope.query. Furthermore, the Reset button will be disabled automatically if $scope.query is empty, due to the use of the ng-disabled directive.

All we need now on the client side is to define $scope.reset() in our controller.

Unfortunately, the query method that WebAPI creates for us does not support searching, sorting, or paginating. (Oddly enough, the pre-release versions of WebAPI did, but the functionality was stripped just before release!) Therefore, we will need to edit TodoController.cs to remove GetTodoItems(), and replace it with this:

(Although we are not using sorting or pagination yet, we may as well include it in our method for later.) The optional parameters to the method are automatically mapped to the querystring by WebAPI – so e.g. index.html?q=something will pass ‘something’ as the value of the ‘q’ parameter. $scope.reset() sets this parameter to $scope.query. So, we now have working search functionality!

Pagination

Let’s now add pagination. That’s pretty simple actually. As you can see from GetTodoItems above, we can pass in an offset and a limit, so we just need to modify ListCtrl to only request 20 items at a time, and keep track of whether we have got all the items available (i.e. if we get fewer than 20 items in response, there is nothing more to retrieve). Note that we are now using the 2nd parameter to query(), which is a callback that is called on success. This allows us to append the additional items to the existing list.
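The pagination logic described above could be sketched like this. The `q`, `limit` and `offset` parameter names follow the text; `load_more`, `items`, `offset` and `more` are hypothetical names introduced for this example.

```javascript
// Sketch of a ListCtrl that fetches 20 items at a time and appends them.
// `Todo.query(params, success)` is assumed to behave like a $resource query.
function ListCtrl($scope, Todo) {
  const PAGE_SIZE = 20;

  $scope.search = function () {
    Todo.query(
      { q: $scope.query, limit: PAGE_SIZE, offset: $scope.offset },
      function (items) {
        // Append the new page; a short page means nothing is left to fetch.
        $scope.items = $scope.items.concat(items);
        $scope.more = items.length === PAGE_SIZE;
      }
    );
  };

  // True while there may be further items to retrieve.
  $scope.show_more = function () {
    return $scope.more;
  };

  // Hypothetical handler for a "show more" link.
  $scope.load_more = function () {
    $scope.offset += PAGE_SIZE;
    $scope.search();
  };

  $scope.reset = function () {
    $scope.offset = 0;
    $scope.items = [];
    $scope.more = true;
    $scope.search();
  };

  $scope.reset();
}
```

One limitation of this approach is that a result set whose size is an exact multiple of 20 triggers one extra, empty request before the link disappears.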

The ng-show directive ensures that this link will not be shown when there is no further data (when show_more() returns false).

Sorting

In order to allow sorting, we’ll need to store sort order and direction in $scope, and then add to the Todo.query() params: sort: $scope.sort_order, desc: $scope.sort_desc . After adding those two parameters, be sure to initialize the order to whatever you prefer as the default.

$scope.sort_order = 'Priority';
$scope.sort_desc = false;

Let’s now add a sort_by function that sets sort_order to whatever it is passed, and toggles the direction if it is called multiple times with the same order.
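A function with that behaviour might be sketched as follows. The wrapper `addSorting` is a hypothetical name used so the example stands alone, and calling `$scope.reset()` to reload the list afterwards is an assumption.

```javascript
// Sketch of sort_by: set the sort column, or flip the direction when the
// same column is chosen again. `addSorting` is a hypothetical wrapper;
// $scope.reset() is assumed to reload the list from the first page.
function addSorting($scope) {
  $scope.sort_order = "Priority";
  $scope.sort_desc = false;

  $scope.sort_by = function (order) {
    if ($scope.sort_order === order) {
      $scope.sort_desc = !$scope.sort_desc;
    } else {
      $scope.sort_order = order;
      $scope.sort_desc = false;
    }
    $scope.reset();
  };
}
```

Resetting to ascending order whenever the column changes is a design choice in this sketch; keeping the previous direction would work equally well.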

End to end web app in under an hour–Part 1
https://jphoward.wordpress.com/2013/01/04/end-to-end-web-app-in-under-an-hour/
Thu, 03 Jan 2013 19:03:20 +0000

Here is the video tutorial that goes with this post:

Here’s how to create a complete web app in under an hour. We will (naturally!) create a todo app. The features:

As you’ll see, this particular choice of tools is well suited to rapid application development, and is also very flexible.

The goal is not just to throw together the minimal necessary to have something working, but to create a really flexible infrastructure that we can use as a foundation for many future applications. OK, let’s get started.

The Backend

In Visual Studio, create a new project and choose to create an MVC web application.

Use the web API template.

Web API is a framework which makes it easier to create REST APIs. It is very similar to ASP.NET MVC.

Delete HomeController.cs, everything in the Content folder, and everything except Web.config in the Views folder. (These are all for using ASP.NET MVC views and default styles, none of which we’ll need.)

By default the API will use XML, but we would prefer JSON (this makes it a little easier to debug), therefore add this to the end of WebApiConfig.Register():

And now we’re ready to create our REST API! After compiling, right-click the Solution Explorer, and choose Add->Controller, and create a controller called TodoController.

You’ll need to choose the option to create a new data context.

You should now have a working REST API! Press F5 to run your solution, and it should open in a browser. You will get a “not found” error since we don’t have any pages set up yet, so you’ll need to modify the URL path to ‘/api/todo’.

Of course, at this stage all we get is an empty array. We need to put some items into our database! First, check out SQL Server Object Explorer to see that Visual Studio has already created a DB for us:

To add items, we are going to use Entity Framework Migrations. We will go to Tools->Library Package Manager in order to open the Package Manager Console, which is where we can enter commands to work with Entity Framework. In the console, type “Enable-Migrations”.

This has created a file for me called Configuration.cs, which allows me to specify data to seed my DB with. Let’s edit that now to seed some data.

Any time I change my model or edit Seed(), I’ll need to run Update-Database in Package Manager Console to have the DB reflect my changes to the code.

Now I’ll refresh my browser to see the data via the REST API:

The Basic AngularJS setup

Now that the API is working, we can create a simple page to show a list of todos. We will use AngularJS as our Javascript MVC framework, so let’s install that: simply type “Install-Package angularjs” at the package manager console. We’ll be using Bootstrap to style things up, so install that too: “Install-Package Twitter.Bootstrap”. In the root of your project, create index.html, with references to the css and js files we’ll be using.

You will need to change the ng-app attribute in the html element, and the title, for each of your projects. Other than that, your index.html will be the same for most of your projects (other than having some different js and css links, of course). All the actual work will occur in AngularJS templates. In order for AngularJS to know what template and controller to use, we need to set up some routes. To do this, we use the config method of the AngularJS module class. Let’s create a new JavaScript file for our AngularJS code; the convention is to call this app.js.

Any properties and methods of $scope are made available automatically in the template. In the template, use ‘handlebars’ (double braces) to indicate where AngularJS expressions should be placed. Don’t forget to add a script element to index.html pointing at your app.js. Now try going to http://localhost:5127/index.html in your browser (you’ll need to change the port of course). If it’s working, you’ll see “Test testing”.

$resource is a function provided by AngularJS that creates an object for accessing a REST API. We use a factory method so that we can reuse the object without it getting recreated every time. Note that we also had to add a parameter to ListCtrl in order to have access to this object.

At this point, you should be able to view the list in your browser.

It’s interesting to note that all the html files are entirely static – the only thing that’s dynamic is the JSON sent by the web API.

That’s the end of Part 1 of this walkthrough. In the next part, we’ll add sorting, searching, and pagination to the list.

$40/month to send my email–are you serious?
https://jphoward.wordpress.com/2011/11/08/40month-to-send-my-emailare-you-serious/
Tue, 08 Nov 2011 04:19:59 +0000

The Kaggle web site needs to send emails from time to time – for example when confirming new users’ email addresses. Sending directly from a web server is not generally a good idea; even if you’ve taken the steps to set up DKIM, Sender ID, etc., you still have the problem that your IP isn’t a reputed email sender. It’s also likely that at some point some bad apple on your IP block will ruin the reputation for all their neighbors.

So instead, I decided to use a mail sending service. Here are some examples, along with their pricing for their cheapest account:

Whoa. That seems like a lot. Is it really so hard to send mail? Actually, no. Here’s a crazy option:

FastMail: $3/month (max 2,000 messages per hour – that’s up to 1,500,000 per month!)

And BTW that FastMail option also gives you a bunch of other stuff you may find useful (e.g. host and manage 50 domains, 2GB file storage, 10GB IMAP storage…)
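The monthly figure quoted above follows from the hourly cap, rounded up slightly. A quick check of the arithmetic, with `perHour` and `perMonth` as names chosen for this example:

```javascript
// Messages per month at a sustained cap of 2,000 messages per hour.
const perHour = 2000;
const perMonth = (days) => perHour * 24 * days;

console.log(perMonth(30)); // 1440000
console.log(perMonth(31)); // 1488000
```

So the cap works out to a little under 1.5 million messages in the longest months, assuming the limit could be sustained around the clock.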

I founded FastMail back in 1999 (I sold it to Opera a couple of years back and don’t work there any more) and worked hard to make the infrastructure efficient. However I’m surprised that folks building much more focussed tools today aren’t able to do it much cheaper. I know the focussed tools I’ve listed have some extra features (such as an HTTP API), but I don’t see why that should increase the unit price substantially.

Visualising Time Series
https://jphoward.wordpress.com/2010/10/18/visualising-time-series/
Mon, 18 Oct 2010 10:14:47 +0000

Over at Kaggle there’s an interesting competition involving time series prediction. Since I’ve never done much with time series before, I figured I’d give it a go. It’s a good chance to learn something new, and have some fun in the process.

I decided to try a new (for me) approach to the analysis, which is to use general purpose programming tools for all the data analysis, including import/export, visualization, modelling, etc. My hypothesis was that with powerful languages with strong functional capabilities, I would be able to achieve results just as quickly as using a dedicated tool (like R), plus have the benefits of a “proper” programming language (e.g. strong language design, excellent IDE, speed, etc).

My first approach was to use Javascript to chart the 400-odd time series in each category (quarterly, and monthly). It turns out that it’s only about 10 lines of code, plus a cut-and-paste of a function from Google Charts docs:
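As a hedged illustration of the kind of glue code involved (not the code from the page itself), the sketch below reshapes one series into the header-plus-rows array that Google Charts' `google.visualization.arrayToDataTable` accepts. The function name `toChartRows` and the "Period" column label are assumptions made for this example.

```javascript
// Turn a plain array of observations into [header, ...rows] form.
// `toChartRows` and the "Period" label are hypothetical names.
function toChartRows(name, values) {
  const rows = [["Period", name]];
  values.forEach((v, i) => rows.push([i + 1, v]));
  return rows;
}

console.log(toChartRows("Series 1", [10.5, 11.2, 9.8]));

// In a browser with the Google Charts library loaded, the rows could then
// be drawn along these lines (element is a DOM node chosen by the caller):
//   const data = google.visualization.arrayToDataTable(toChartRows("Series 1", values));
//   new google.visualization.LineChart(element).draw(data, { legend: "none" });
```

Repeating this for each series, with one small chart element per series, gives an all-at-once overview of the kind described here.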

The result is this page, which is a fast and easy way to see all the time series at once (click one of the buttons on that page to see the data). If you’re interested in seeing how it works, feel free to look at the JavaScript linked from that page.

Next, I moved to C#, and found that the functional capabilities added in .Net 3.5 (LINQ et al), and the automatic parallelization added in .Net 4, made it a real pleasure to work with. I also used GlowCode to profile my algorithms as I went, which made it easy to keep them running fast. I used the free Microsoft Chart components, plus a FlowLayoutPanel, to easily generate visualizations. For example, here’s a (subset of a) visualization showing in-sample predictions (blue) vs actual data (orange):

In this example, it’s easy to see some models that aren’t ideal: series 2 shows that the underlying trend is not matching closely enough in this instance, and series 5 shows the problem of using additive seasonality inappropriately. You can see that adding the series number and a fitness metric to each chart makes it easier to work with.

Here’s a visualization showing out-of-sample predictions for a different model:

In this case we can confirm visually that the models have reasonable-looking predictions.