You should have signed up for class notes or questions (as described on the assignments page).

You should have signed up for search industry updates (as described on this page).

The structure of today's class is going to be fairly standard for the rest of the semester:

Start off with announcements and taking questions and comments.

Lecture for a bit. This lecture will be something of an overview and will provide the motivation and background for the exercises that you will complete and the assignments that you will have to work on.

Provide some time for you to start on your exercises (which will allow you to explore the specifics of the topic that my lecture introduces).

Resources

03 Wiki Instruction Day

We'll go over techniques and tricks related to using this wiki, which will also be the host of your term project wiki.

At beginning of class

See if there are any questions about the assignment that students got last class.

Note that students should check the announcements on the Web site on Sunday evening, Tuesday evening, and Thursday evening. Shouldn't take more than a minute or two.

Students have their next assignment today: Search tool overlap data. This is due a week from today. Explain the assignment.

My notes

Today we're going to learn about wikidot, get an idea of how to use it, and get more familiar with working with a wiki.

Wikidot is the host of the course Web site, but it's also going to be the host of your term project Web site.

You are a member of the course Web site, and you will be the administrator of your own term project site. This means that, while you have total and complete control over your own site, you also have the ability to edit and create pages (but not delete them) within this course Web site.

I will expect that you will be a very good user of this wikidot site by the end of the semester. Maybe you won't be an expert, but you'll be able to make a wiki that is filled with properly formatted content and useful navigation.

My notes

Special search syntax: This is the tool at your disposal that lets you target your searches on specific parts of documents. Since text in different parts of a document means different things and performs different functions, you can use these operators to raise the precision of your queries.

Full text search engines

Title — intitle:

Site — site:

Top-level domain — site:

URL contents — inurl:

Links — link:

Unique words and phrases: The use of multiple unique words and phrases is key both to reducing the number of documents that are retrieved and to raising the precision of your queries. Further, using multiple words and phrases increases the chances of retrieving content-filled documents (that is, increasing the number of “meaty” documents).

They can be used to focus in on more specialized pages that would use those terms.
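To make the combination of operators and unique phrases concrete, here is a minimal sketch in Python that assembles a query string from the operators listed above (intitle:, site:, inurl:). The function name build_query and the sample terms are made up for illustration, and exactly how a given search engine interprets each operator is something you should check for yourself.

```python
def build_query(terms, phrases=(), title=None, site=None, url_part=None):
    """Assemble a search query from plain terms, exact phrases, and operators."""
    parts = list(terms)
    # Exact phrases go in double quotes so they are matched as a unit.
    parts += ['"{}"'.format(p) for p in phrases]
    if title:
        parts.append("intitle:" + title)    # word must appear in the page title
    if site:
        parts.append("site:" + site)        # a site (umich.edu) or a top-level domain (edu)
    if url_part:
        parts.append("inurl:" + url_part)   # word must appear in the URL
    return " ".join(parts)


# Hypothetical example: restrict a phrase search to .edu sites.
query = build_query(["wiki"], phrases=["page monitoring"], site="edu")
print(query)
```

Each operator you add narrows the set of documents, which is the point: fewer results, but more of them on target.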

At end of lecture

Start working on today's exercises. The exercises are on this page. You should work on them for no more than another hour outside of class; we will have more time in the next class after the lecture to continue working on them before going on to that day's exercises.

If you are late turning in today's assignment, you still should go through the effort of posting the information to the results page — the analysis assignment that you will be doing depends on having this information.

If you are going to write a blog related to today's exercises, be sure to review the blogging guidelines before doing so.

Class structure

At beginning of class

So, my wife has liver cancer. Either benign (hemangioma) or malignant. We don't know yet. We've voted for benign. She'll be having an MRI in the next couple of days (we hope). Just wanted you to know.

You should know that I send email when I have something to say to a specific student between classes. For example, if I have something to tell you about your assignment specifically or an upcoming industry update or whatever, I will send you an email. The email will have BIT330 at the beginning of the subject line. For this reason, you might want to check your email no less than every couple of days to see if you have an email from me. More than likely, you won't, but you might.

If any of you wrote a blog entry, please turn it in now. Generally, if you have written a blog entry, just bring it up to the podium for me before class starts.

We now have some information on the notes page and questions page. Even if people are not signed up for specific days on the class notes page, I would still recommend that you post notes and questions. The more good information that is on these pages, the more that I will be able to use this information on the tests, and the better that you will do on the test — as opposed to me coming up with some random, poorly-worded question that you have to guess on.

I have significantly updated the write-up of the next assignment. It is due on September 29. Do not start working on this yet. The data still needs to be cleaned up a little bit. I will make an announcement on the wiki when the data is ready. Further, I will post an Excel spreadsheet containing the data.

On day 8, which is September 29, you are turning in the first status report for your term project. On that date, you need to have decided on the topic, you need to have discussed your topic with me, described the topic on the start page of your wiki, and updated your information (that is, indicated the title of your wiki) on the class wiki's list of student wikis.

Another note for your term project. Your term project reports will include a section on information sources (as we will discuss today). Part of this will be an evaluation of the quality of the information sources that you identified. You will want to describe how you evaluate the sources, and indicate on the report your evaluation of each one of them. This will not be a separate deliverable but should be integrated into the final report.

There are so many blog opportunities from these two classes (i.e., today and Monday). If you want to blog on both classes, you don't have to choose something from “last class” and then something from “this class”. These are both the same topic; you can use any two things you want to blog on from both classes. It doesn't matter if they were the same or different days. (Again, you don't have to blog today, or last class. But you'll have to blog sometime, and you might as well start sooner rather than later.)

Topica appears to have changed their focus. They certainly don't advertise that they have an email discussion list directory, but they do. You can search either the list descriptions or the messages themselves.

I will return blogs next class. Why? Two reasons: 1) I want to have more turned in before I assign these grades. 2) I want you to have entered this information into SiteMaker so that I can test the grade recording process.

I'll go over

What this class should be like so far

Any questions about this class so far this semester? Where we're going? Anything at all?

My notes

The Internet is changing all of the time. New resources are being added at a phenomenal pace in millions of different sites. You can't keep up with everything on your own. You need help.

It's all about getting computers to work for you, to work while you're not using them. Use the computer to search through information so you don't have to. Use the computer to deliver information to your email inbox or to a specific Web page so you don't have to go get it. You don't have to remember to do the query.

You still have to define the search. You probably have to spend more time up-front when defining the query.

All blog entries, industry updates, notes, and questions are points out of 10.

I have several printed blogs for which you have not entered information yet.

A “9” grade on a blog is what I would call a “normal, high-quality, well-written, informative blog entry.” A “10” means that you exceeded this standard. Your entry was somehow more informative, more insightful, more engaging (don't discount this — I very much welcome reading an interesting well-written entry with a good story integrated into it) than my expectations.

If you get a “10” on a blog entry, I want you to copy your blog entry from your wiki to my wiki. Create a page with the same name (i.e., “blog:XXX”) but it should be in the class blog. Do this as soon as you see your grade. Thanks.

I thoroughly enjoyed meeting with most of you last week and discussing your term projects. I'll be checking the version of your wiki home page as of the beginning of class.

Go over any other announcements you might have missed since last class.

Check the start page for any blogs you might be interested in that you might have missed.

From now on you don't have to “turn in” blogs with a piece of paper. You simply need to record the page within the SiteMaker page. I will be able to see that the page doesn't have a grade and will grade it (with a goal of the next class for the rest of the semester).

Class structure

At beginning of class

On your own

Be sure to have all of your information entered under “Basic/Grades”. Otherwise, I can't record your grades!

Do not leave class today if your information is not entered into the SiteMaker site.

Personal

My wife's MRI came back with less-than-encouraging news.

She now has a biopsy scheduled for next Monday during class. I should be with her.

So we're not having class next Monday.

Check who is doing class notes for today. I had to change this a bit for this class and next Monday because of the previous item.

Assignments

Because of the turmoil of yesterday in my family, I didn't get much grading done.

I hope to have more graded soon.

I hope to discuss the search tool data analysis assignment next class.

Possible blog entries

There are two possible blog entries related to this class — you can write one, both or neither of these. But I would find these interesting.

Write a blog entry on what you observed, what you learned and found interesting, focusing on information that other students might find useful.

Go talk to a Ross librarian. Tell them your topic and ask which 3 to 5 databases or tools you might find most useful given that topic. See what databases they tell you to focus on. Use them for a while. By the end of the semester, write a blog entry describing how the information you find in these databases differs from what you would find in the Web at large or what you found in the Deep Web search tools we were introduced to above.

BTW, I would find it rather remarkable if you didn't have in your term project a section or group of resources or something related to information a person could get in a library's database (compared with Deep Web and the Web itself).

Articles

As summarized by the editor of The Journal of Electronic Publishing: "Michael K. Bergman, whose BrightPlanet company offers a new approach to search engines, examines the wealth of information that is available only on dynamically created Web sites, those that don't exist except as relational databases until someone seeks information from them. As more sites adopt the dynamic approach to pages, they are creating a challenge for standard search engines. This article looks at some alternatives."

There are 40 possible points here. I looked for responses related to meaning, recommendations, learning, and further questions. I read everyone's responses, and grouped like responses together while ranking them. I then assigned points from 15 to 37 depending on the relative quality and completeness of your responses.

From now on you don't have to “turn in” blogs with a piece of paper. You simply need to record the page within the SiteMaker page. I will be able to see that the page doesn't have a grade and will grade it (with a goal of the next class for the rest of the semester). Be especially certain to do this before Fall Break.

Email filtering

A powerful method that can be applied to email alerts is to use “plus addressing” when you sign up for an alert (e.g., one generated by some query). That is, tell them that your address is dummy+someQueryIdentifier@gmail.com instead of your normal address, dummy@gmail.com. Mail sent to the plus address still arrives in your normal account, but you can filter it by what comes after the plus! This is extremely helpful since it makes it much easier to filter emails.
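As a rough illustration of what a filter does with a plus address, here is a minimal Python sketch that splits an address into its base address and the tag after the plus, then picks a folder from the tag. The function names and the "Alerts/" folder naming are assumptions made up for this example; your mail service's own filter settings do the real work.

```python
def split_plus_address(address):
    """Return (base_address, tag); tag is None when there is no plus part."""
    # Assumes the address contains an "@"; no validation is attempted.
    local, _, domain = address.rpartition("@")
    name, plus, tag = local.partition("+")
    base = name + "@" + domain
    return base, (tag if plus else None)


def folder_for(address, default="Inbox"):
    """Pick a (hypothetical) folder name based on the tag after the plus."""
    _, tag = split_plus_address(address)
    return "Alerts/" + tag if tag else default


print(split_plus_address("dummy+someQueryIdentifier@gmail.com"))
print(folder_for("dummy+someQueryIdentifier@gmail.com"))
print(folder_for("dummy@gmail.com"))
```

Using one tag per alert means you can always tell which query produced a given message, even if the sender changes the subject line.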

Project wiki clarification

You should do the following for your project wiki:

You should figure out some way to document the email filters that you use in your email account to route your incoming alerts. Maybe print the alert page to a PDF file and link it to your wiki? Maybe take a screenshot of your email inbox and highlight the email alerts?

In either case, you are going to want to have a section in your wiki called "Email alerts".

On this page you should describe each of the email alerts that you used: the page from which you subscribed to it, why it is useful, and if there are any keywords (or such) that you used to generate it.

My notes

Page monitoring software

Overview

Page monitors were the next big thing five years ago. A page monitor is a program that you download or a web-based service. Each day (or at whatever interval you set) it downloads the web page, and if the page is different it sends you an email. Some tell you what has changed while others just tell you that it has changed.
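The core idea of a page monitor can be sketched in a few lines of Python: keep a fingerprint of the page from the last visit and compare it with a fingerprint of the page now. This is only a sketch of the "has it changed" kind of monitor, not how WatchThatPage or any other product actually works. The names page_digest and check_page are made up, and fetching the page and sending the email are left out.

```python
import hashlib


def page_digest(html):
    """Fingerprint a page's content so two versions can be compared cheaply."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def check_page(url, html, seen):
    """Record the latest digest for url; return True if the page changed.

    `seen` is a dict mapping URL to the digest stored on the previous visit.
    The first visit to a URL is never reported as a change.
    """
    new = page_digest(html)
    old = seen.get(url)
    seen[url] = new
    return old is not None and old != new


seen = {}
print(check_page("http://example.com/", "<p>version 1</p>", seen))  # first visit
print(check_page("http://example.com/", "<p>version 1</p>", seen))  # unchanged
print(check_page("http://example.com/", "<p>version 2</p>", seen))  # changed
```

A monitor that tells you what changed would have to store the old text itself (not just a fingerprint) and compare the two versions line by line.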

At first, you might not be that impressed with page monitors. But once you realize that they can be used for a lot more than news, they can be quite useful tools. WatchThatPage.com is the best free site.

WatchThatPage has a limit of 250 characters for the URL. Also, shortened URLs (from tinyurl.com) do not work. To get around these problems, use TrackEngine, where neither of these problems exists.

Capabilities

Automatically determine if a Web page, or part of a Web page, has changed

Make a feed

From a page

Description: Dapper is pretty slick. You can look through user created Dapps or you can (easily) create your own. Don’t forget to use the “get a nice short url” option and create your own that is easier to look at/use. This allows you to get an RSS feed for more things (instead of just news and blogs) such as searches.

Feed43 is a little bit more complicated. You have to find the actual html within the source code of the page.

Define Extraction Rules: find the specific places (within the code) that hold the information you want the RSS feed to monitor. There are directions for what specific code to use in the program.

Then click extract.

Then you can give it a title, description, URL, etc.

Then specify where the title, date, etc. are.

If these sites are updated once a month, it's too much of a hassle to make one of these (use a page monitor). But if a site is updated daily and you want to monitor it, then it might be a good idea to make one!
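To give a feel for what an extraction rule does, here is a small Python sketch that pulls a link, title, and date out of each repeating item in a page's HTML. It uses a regular expression as a stand-in for the idea. It is not Feed43's own pattern syntax, and the sample HTML and the name extract_items are invented for the example.

```python
import re

# One rule describing where the link, title, and date sit inside each item.
ITEM_PATTERN = re.compile(
    r'<li><a href="(?P<link>[^"]+)">(?P<title>[^<]+)</a>\s*'
    r'<span>(?P<date>[^<]+)</span></li>'
)


def extract_items(html):
    """Return one dict (link, title, date) per item matched in the HTML."""
    return [m.groupdict() for m in ITEM_PATTERN.finditer(html)]


# Made-up page source with two repeating items.
sample = """
<ul>
<li><a href="/news/1">First story</a> <span>2008-10-01</span></li>
<li><a href="/news/2">Second story</a> <span>2008-10-02</span></li>
</ul>
"""

for item in extract_items(sample):
    print(item["date"], item["title"], item["link"])
```

The hard part, as with Feed43, is reading the page source to find markup that repeats reliably around every item; if the site changes its layout, the rule stops matching.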

From other feeds

FeedRinse: From their site, “Feed Rinse is an easy to use tool that lets you automatically filter out syndicated content that you aren't interested in. It's like a spam filter for your RSS subscriptions.”

Email alerts: These were a precursor to RSS feeds. Some sites will give you updates to their site via email, not RSS.

Hints about possible test questions

You're definitely going to be held responsible for the following topics:

What WatchThatPage (as an example of a page monitor) can do

What Dapper can do

What Feed43 can do and how its search patterns work

What Yahoo Pipes can do and how feeds can be manipulated (for example, Fetch Feeds, Union, Filter, Sort)

Under what circumstances would you use each one of these tools (as opposed to another)

I'll add to this later but this should give you an idea of the type of questions that I might ask.
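If it helps you study the feed operations named above, here is a rough Python analogy for Union, Filter, and Sort. This is not Yahoo Pipes itself: the feeds are plain in-memory lists of made-up items (so the fetching step is skipped), and the function names are chosen only to mirror the module names.

```python
def union(*feeds):
    """Merge several feeds (lists of items) into one list."""
    merged = []
    for feed in feeds:
        merged.extend(feed)
    return merged


def filter_items(items, keyword):
    """Keep only items whose title contains the keyword (case-insensitive)."""
    kw = keyword.lower()
    return [item for item in items if kw in item["title"].lower()]


def sort_items(items, key="date", newest_first=True):
    """Order items by one field; ISO-style date strings sort correctly as text."""
    return sorted(items, key=lambda item: item[key], reverse=newest_first)


# Hypothetical feeds standing in for what a fetch step would return.
feed_a = [
    {"title": "Wiki tips", "date": "2008-10-01"},
    {"title": "Search news", "date": "2008-10-03"},
]
feed_b = [{"title": "More search tricks", "date": "2008-10-02"}]

result = sort_items(filter_items(union(feed_a, feed_b), "search"))
for item in result:
    print(item["date"], item["title"])
```

Notice that the order of the steps matters for efficiency but not for the result here: filtering before sorting just means there is less to sort.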

Possible blog topics

You do not have to write a blog. These are suggested blog topics if you were to write one. There are lots of possibilities in this class. Describe different ways that you found these tools useful. Describe how you used Yahoo Pipes, possibly differently than how we have described them here.

General market size information

The following series of charts shows the relative traffic for several social news and/or bookmarking sites. The first chart shows that digg has about 7x the traffic of reddit. I am using these two sites as benchmarks for the two charts after that. (Each chart can only compare five sites.) In the second chart, digg is the benchmark, while in the third chart reddit is the benchmark. Again, remember that digg has about 7x the traffic of reddit.

The second chart shows the relative traffic of digg, stumbleupon, slashdot, propeller.com and fark. Here you can see that digg is approximately 3x the size of the next busiest sites, stumbleupon and slashdot. Propeller and fark are no more than 40% of the size of those smaller sites.

The third chart shows reddit as the clear size leader, though mixx (which only came onto the scene in any reasonable sense in May 2008) is growing quickly and relatively steadily. Newsvine is showing some signs of growth as well. Both dzone and sphinn have maintained a relatively steady amount of traffic for the last year.

The fourth chart shows traffic comparisons for sites that are more focused on social bookmarking than news. There has been clear growth in traffic over the last year, with CiteULike and Diigo leading the way. Over the same period of time simpy has lost traffic, ma.gnolia has held steady, and Connotea has grown significantly.

The fifth chart shows the relative traffic of the two largest social news sites (digg, stumbleupon), the largest social bookmarking site (delicious), and a smaller, but growing, social bookmarking site (diigo), which can barely be seen at the bottom axis.

Resources

Social News sites

General news

Digg (about, tour, search): “Digg is a place for people to discover and share content from anywhere on the web… We’re here to promote that conversation and provide tools for our community to discuss the topics that they’re passionate about.”

Reddit (about, search): “reddit is a source for what's new and popular on the web — personalized for you. Your votes train a filter, so let reddit know what you liked and disliked, because you'll begin to be recommended links filtered to your tastes.”

StumbleUpon (about, video intro, guide, search): “StumbleUpon helps you discover and share great websites. As you click Stumble!, we deliver high-quality pages matched to your personal preferences… This helps you discover great content you probably wouldn't find using a search engine.”

Propeller (about, tour, search): “Propeller is a social news portal, meaning that it is programmed by you – the audience. Our members post links to stories from all over the Web… Once the link has been posted, you can vote on it, comment on it, share it with friends, or bookmark it to read later.”

Fark (about, help, search): “Fark.com, the Web site, is a news aggregator and an edited social networking news site… The idea was to have the word Fark come to symbolize news that is really Not News.”

Mixx (about, tour, search): “You find it; we'll Mixx it. Use YourMixx to tailor the content categories, tags, specific users and groups, and we'll deliver the top-rated content as chosen by you and people who share your passions. So go ahead and whip up your own version of the web. Just tell us how you like it Mixxed and we'll deliver the best the web has to offer”

NewsVine (welcome, help, search): “At Newsvine, you can read stories from established media organizations like the Associated Press and ESPN as well as individual contributors from all around the world. Placement of stories is determined by a multitude of factors including freshness, popularity, and reputation. Contribution is open to all, and editorial judgement is in the hands of the community.”

Technology focus

Sphinn (about, help, search): “Sphinn is a social site for search and interactive marketers. It's designed to allow you to share and discover news stories, read and take part in discussions, discover events of interest and network with others.”

For researchers & scientists

Connotea (about, guide, search on homepage): “Saving references in Connotea is quick and easy. You do it by saving a link to a web page for the reference, whether that be the PubMed entry, the publisher's PDF, or even an Amazon product page for a book. Connotea will, wherever possible, recognise the reference and automatically add in the bibliographic information for you.”

CiteULike (help, search on homepage): “CiteULike is a free service to help you to store, organise and share the scholarly papers you are reading. When you see a paper on the web that interests you, you can click one button and have it added to your personal library. CiteULike automatically extracts the citation details, so there's no need to type them in yourself.”

Social Bookmarking sites

Delicious (about, video tutorial, help, getting started, search): “Delicious is a social bookmarking service that allows you to tag, save, manage and share Web pages all in one place. With emphasis on the power of the community, Delicious greatly improves how people discover, remember and share on the Internet.”

Furl (about, help, video tour, uses of furl, search): “Furl is a free service that saves important items on the Web, allowing quick retrieval for future access. Furl archives a personal copy of every page and provides a search service the full text of all archived items. Each Furl member has a personal archive of 5 gigabytes (GB), large enough to store tens of thousands of searchable items.”

Magnolia (tutorial, search): “How is Ma.gnolia different from other social bookmarking services? It starts with making the social side of social bookmarking work better. With contacts, groups and different ways to share bookmarks both within and outside of Ma.gnolia, we make working together on a casual basis or more formal projects fun and easy.”

Simpy (about, help, search on homepage): “Simpy is a social bookmarking service. With Simpy, you can save, tag and search your own bookmarks and notes or browse and search other users' links and tags. You can be open and share your links with others, or keep them private. Simpy also helps you find like-minded people, discover new and interesting sites, publish your bookmarks, detect and eliminate link-rot, etc.”

Class structure

At beginning of class

No new grades.

The exam is one week from Monday (i.e., November 10).

You should review the questions that have been posted so far, add to them as necessary, revise them as you see fit, and add pages for questions from days that don't have questions. (Do this by clicking on the link for the appropriate day.) Your motivation for this is that more questions that you have seen before might be on the exam. All of this posting of/revising questions must be done by midnight at the end of a week from this Friday.

The exam will be multiple choice, true/false, short answer, and essay.

You will be responsible for my lectures, for my in-class demonstrations, for your exercises, for student industry/company updates, and for anything that I have written on the Web site.

Post any questions you might have about the exam on the class forum under “Assignments.”

Class structure

At beginning of class

My wife had another MRI last night at 3:40am, this time on her spine. It went fine — just took a long time.

No new grades

The exam is next Monday (i.e., November 10).

I have arranged for a speaker from Google to come to class on Monday, November 24. He is the same speaker I had last year, and he did a fantastic job. I'm very much looking forward to his day in class.

What can you learn about your Web site

Yahoo tools

The Yahoo SiteExplorer allows you to explore all the web pages indexed by Yahoo! Search. View the most popular pages from any site, dive into a comprehensive site map, and find pages that link to that site or any page.

19 Test

All material other than your pencil/pen should be along the walls of the classroom, away from your desk. Turn off your cell phone and all electronics; put them in your possessions along the walls. Put all hats and coats with your possessions as well.

You have 1 hour, 10 minutes to complete the test. It's longer than I thought it was going to be but it shouldn't be a time problem for you to finish it.

Read the front of the test when you get it but do not open it until I say.

Travel

Google Sightseeing: "Google Sightseeing takes you on tour of the world as seen from satellite, using the free Google Earth program, or Google Maps in your web browser. Each weekday your guides James and Alex present new weird and wonderful sights as suggested by readers."

TripBase: "Tell us what you like. We'll tell you where to travel." (review)

Entertainment

Brewster Jennings Protects America: "Remember playing "Where in the World is Carmen Sandiego" as a kid? Well now the new game Brewster Jennings Protects America brings this classic adventure into the 21st century by merging the game play with Google maps technology*. In the web-based Brewster Jennings Protects America game you race around the globe as a government agent trying to stop a deadly terror attack from taking place…. "

World Sunlight Map: "Watch the sun rise and set all over the world on this real-time, computer-generated illustration of the earth's patterns of sunlight and darkness. The clouds are updated every 3 hours with current weather satellite imagery."

Class structure

At beginning of class

At my request, we're going to have a visit from Nigel Melville to help explain his very exciting new class for BBAs that he's teaching next semester: BIT/MKT 378 “Service Innovation Management”. The following video relates to the MBA version of this class (which he taught last year).

Grades

I graded the tests.

Possible points: 108

Range: -3 to -64

Median: -11.5

At the end of class I'm handing them out; we'll go over them; you'll give them back to me before you leave this classroom.