Stony Stevenson writes "Canadian researchers have promised to squeeze 'decades' of cancer research into just two years by harnessing the power of a global PC grid. The scientists are the first from Canada to use IBM's World Community Grid, a network of PCs and laptops with power equivalent to one of the globe's top five fastest supercomputers. The team will use the grid to analyze the results of experiments on proteins, using data collected by scientists at the Hauptman-Woodward Medical Research Institute in Buffalo, New York. The researchers estimate that this analysis would take conventional computer systems 162 years to complete."

If you run it at a low level, you can increase your usage by only about 1-2% and still help the project. There is no logical reason to run the client at 100% if it's going to cost you a bomb; at 1-2% you won't win any contests, but you will be helping the project and paying at most a buck or two extra on electricity a month.
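For what it's worth, recent BOINC clients let you cap CPU use through a local preferences override; a sketch of what that file could look like (tag names from memory, so double-check against your client's documentation, and the 2% figure just matches the example above):

```xml
<!-- global_prefs_override.xml, placed in the BOINC data directory -->
<global_preferences>
    <!-- Use at most 2% of CPU time (matches the example above) -->
    <cpu_usage_limit>2.0</cpu_usage_limit>
    <!-- Optionally also restrict BOINC to half of the cores -->
    <max_ncpus_pct>50.0</max_ncpus_pct>
</global_preferences>
```

After saving the file, the client has to be told to re-read local preferences (or simply restarted) for the cap to take effect.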

Personally, across our 3 PCs in a smoke-laden environment, I've only seen a (mobo-measured) temp increase of at most 4-5 degrees C (and usually only 2-3 degrees, on systems ranging from a PIII with XP Pro to an Athlon XP 2000+ dual-booting Ubuntu/XP Pro...)

BOINC seems to run a wee bit hotter on Ubuntu, but I've not benchmarked the two clients yet. I'm just guessing that more efficient code allows for more ops per cycle, meaning more CPU use and thermal waste, but that's all it is: a guess. Anyone else have any input?

I'm not yet one of the climate change true believers. It just feels too much like an economic scheme based on pseudo-science and half-truths. "The world is doomed; here's how you as a consumer can spend your way to salvation! Buy a new car and light bulbs filled with mercury!"

Its poster child, Al Gore, uses the word "if" too much. It's an old debating trick: say "if X, then Y", focus on the terrible consequence Y, and completely avoid the debate, which is over the validity/scope/level/definition of X.

... Al Gore, uses the word "if" too much. It's an old debating trick, to say "if X, then Y", and focus on the terrible consequence Y, and completely avoid the debate - which is over the validity/scope/level/definition of X

I don't see it as a trick, but rather as being honest. Many of the "X" items aren't certain; it would be a lie to present them as such. But we can estimate the probability of X (based on the current state of knowledge), and explore the consequences if X *does* occur. Gore's argument is

Are you saying that the quote is false? The [Citation Needed] is linked in the post.
Are you saying that when you exhale, you don't release carbon into the atmosphere? If so, you need to take a biology class.
Or, are you saying that you can survive for hours without breathing? If so, I would be very impressed by seeing a demonstration.

The citation is the movie "An Inconvenient Truth". I even gave you a link to a clip of the movie that specifically states the quote. I don't know what you are babbling on about Wikipedia for. The video clip is absolute proof of the quote. If you think the video is faked, then the only way you will believe it is if you source the video yourself. Just rent "An Inconvenient Truth" and watch the screen just before the ending credits. What would you accept as a valid citation for the quote, if you won't accept the video?

No, I would not assume that "you" means your entire environment, and I do not accept "carbon footprint" as the real meaning. Once we fall into the mode of interpreting what we think he really meant, we might as well pull out the Bible and start looking for passages we can interpret to mean that we should buy hybrid cars. Remember, this is supposed to be a college-level seminar using solid, clear science to convince people of global warming. When you are dealing with sc

What's ridiculous about the debate is that the supposed "corrective actions" are a step backwards if you really analyze them. Don't buy a Prius: it may get better mileage, though if you convert to gallons per mile, a truer measure of energy cost, it doesn't look so good. Never mind the fact that it runs on laptop batteries, which makes it a disposable vehicle at the end of the day.

Or go to a "super efficient" diesel engine. Well, there are reasons we restrict the number of diesels that can be put on the road, and that

This is what I hate about this climate change hype. Nowadays if you say that you want to do something for the environment, you're automatically assumed to mean that you want to prevent "global warming" or whatever. Personally I don't give a crap about climate change; I want mankind to pollute less and to use natural resources more efficiently, because this results in an improvement in the environment. Sure, if mankind really is causing the climate to change

Yep. It almost makes you wish that nobody had done anything about 2-digit dates so that January 1, 2000 could have been a serious problem. That way you wouldn't have revisionist people denigrating the efforts put in to avoid the Y2K problem as a waste of resources. I think a lot more people would be willing to appreciate the potential risks of Global Warming if, among other glitches, their company's payroll systems had made it hard for them to get a paycheck in the first few months of 2000.

Can I run it so that SpeedStep/Cool'n'Quiet still works? What I mean is, I do not want to run anything that increases the CPU frequency; it should keep the CPU at its lowest frequency. Can this be accomplished?

Linux's CPU frequency scaler has this option. For example, the 'conservative' governor has the file /sys/devices/system/cpu/cpu0/cpufreq/conservative/ignore_nice_load, so a program running at lower-than-default priority will not increase the CPU frequency.

I use a script [iki.fi] to handle CPU frequency changes. When I'm at home with my laptop, I use the "ignore nice" option which in practice will turn the fan off. YMMV. When I go somewhere, I can set the CPU to full steam.

It's definitely a better idea to use the internet for communication and to use electricity for things that benefit the household/office directly. I wouldn't be surprised if the cost in reduced years of life from the extra pollution of running these distributed tasks outweighs the years of life extended by treating cancers.

It's easy to feel that way until someone in your family is diagnosed with cancer. Also, treating cancer does not just "extend life". There are a lot of younger people (20 to 40 years old) who get different forms of cancer. For them, it's not "will I live to 76 or will I live to 80?" but "will I live to see 30?". Don't even get me started on the kids who are afflicted with these diseases.

You, sir, are a candidate for the Fox 5 at 10 school of massively misjudging actual risks. Here's a hint: if you thought that pollution from using your computer was going to be SO great that it would dwarf the benefits of curing CANCER (a disease that was killing people long before we had global warming hysteria), then you should probably: 1. never use a fucking computer; 2. never destroy the "environment" by READING OR POSTING TO SLASHDOT!!!

Rather than turning on my heater these past few days (it's getting chilly at night in Houston, TX), I run the GPU Folding@home client on my PC. Seriously, it's not wasted energy if you want your home to be heated. You also participate in a worthy cause to boot!

So unless you heat your home with electricity, which practically no one north of Florida does unless they have VERY cheap electrical power, you'll still be paying more by running computers.
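For the curious, here's the back-of-envelope version of that claim. Every price and efficiency figure below is an assumption for illustration, not a current rate:

```python
# Compare the cost of a kWh of *heat* from resistive electric heating
# (e.g. a PC crunching BOINC work) vs. a natural-gas furnace.

ELECTRIC_PRICE = 0.12       # $/kWh of electricity (assumed)
GAS_PRICE_PER_THERM = 1.20  # $/therm of natural gas (assumed)
KWH_PER_THERM = 29.3        # energy content of one therm
FURNACE_EFFICIENCY = 0.90   # assumed high-efficiency furnace

# Resistive electric heat is ~100% efficient: every kWh in becomes heat.
electric_cost_per_kwh_heat = ELECTRIC_PRICE

# Gas: one therm delivers 29.3 kWh times furnace efficiency as useful heat.
gas_cost_per_kwh_heat = GAS_PRICE_PER_THERM / (KWH_PER_THERM * FURNACE_EFFICIENCY)

print(f"electric: ${electric_cost_per_kwh_heat:.3f} per kWh of heat")
print(f"gas:      ${gas_cost_per_kwh_heat:.3f} per kWh of heat")
```

Under these assumed prices, electric resistance heat comes out roughly 2-3x more expensive per unit of heat than gas, which is the parent's point: the computation isn't free just because the waste heat is useful.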

I don't think anyone would disagree with you. The point that the parent post was trying to make is that a nice side benefit of running a distributed computing client like F@H is that the heat from your computers will help heat your home. Would anyone suggest running a bunch of quad-cores at 100% as a replacement for n

In university, I moved with a roommate into a 'rear suite' (the street number was 669 1/2) which had recently been renovated, but which had also spent a great deal of time uninhabited. As a result, the utilities had been shut off, since no one was using them. 'Utilities' in this case, however, refers only to electricity, since in this area (Fredericton, NB), any heat source other than electricity or oil (which would be hauled to your home in a tank truck) was unthinkable. Natural gas was 'too new' and

Meanwhile, since I live in Canada and by this time of year I do need heating, I have my boinc client running at 100%, I'm doing some good, and (since the peak capacity of the machine is justified in other ways) it's not costing a penny. The heating here is electric anyway; it may as well do some computation on its way into my home!

Doing whatever@home in the winter is just good sense.

Now what's needed is a distributed computing client that is controlled by a room thermostat. No, really, I'm totally serious
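Since you're serious: here's a rough sketch of what that thermostat logic could look like. The setpoints are arbitrary, and the `boinccmd --set_run_mode` call is an assumption about one way to start/stop the client; a real setup would also need an actual temperature sensor:

```python
# Thermostat-driven distributed computing: crunch numbers only when the
# room needs heat, with hysteresis so the client doesn't flap on and off.
import subprocess

LOW = 19.0   # degrees C: below this, start crunching (assumed setpoint)
HIGH = 21.0  # degrees C: above this, stop crunching (assumed setpoint)

def decide(temp_c, currently_running):
    """Hysteresis rule: turn on below LOW, off above HIGH, else keep state."""
    if temp_c < LOW:
        return True
    if temp_c > HIGH:
        return False
    return currently_running

def apply(run):
    # Assumption: 'boinccmd --set_run_mode always|never' controls the client.
    mode = "always" if run else "never"
    subprocess.run(["boinccmd", "--set_run_mode", mode], check=False)
```

Poll a room-temperature sensor every minute or so, feed the reading to `decide`, and call `apply` only when the decision changes.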

There are plenty of volunteers who give a huge amount of computing power. The first-place volunteer has over 1.3 million results, and several have more than 100,000. So there must be someone who believes in donating a lot of computing time. I can imagine how much time it would take just to install the program on the several thousand computers required to get that many results. The question is "Is it better to donate the computing time, or is it better to donate the cost of the electricity needed by the c

Yeah, but they still have to gather all the research and organize it; the computer will be much faster than the human operators. Oh, and when this thing finally discovers that it doesn't need humans, I would like to personally say that I humbly accept our new robot overlord.

I am seriously wondering why they didn't think of the PS3s. 700,000 PS3s recently subscribed to a network that ended up delivering petaflops of peak performance. If I were managing this stuff, I would seriously take a look in that direction. Cell processors are designed for such distributable tasks, and they are very good at it.

I hope they're using programs that've had a few computer scientists' eyes over them. One of the issues I see with supercomputing is that people tend to see it as a way to get around dumb code: if the computer's fast enough, you can implement *five* infinite loops, have an exponential-time algorithm, and still get the calculations done before dinner!

Heh. Since a lot of the calculations are floating point, I think you're at least as likely to have numerical analysis errors that make the data coming out of that loop be dominated by precision errors. But I think in a lot of cases they do use optimized libraries (e.g., LINPACK) that do most of the math properly and limit the options for really dumb code.
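For anyone who hasn't seen it, here's the canonical tiny example of the precision problem:

```python
# Floating-point accumulation error in miniature: ten copies of 0.1 don't
# sum to exactly 1.0 with naive addition, because 0.1 has no exact binary
# representation. math.fsum does error-compensated summation. In a
# simulation with billions of operations, this kind of drift can dominate
# the result if the algorithm isn't numerically careful.
import math

vals = [0.1] * 10
naive = sum(vals)
compensated = math.fsum(vals)

print(naive)        # 0.9999999999999999
print(compensated)  # 1.0
```

Good numerical libraries are full of tricks like this (compensated sums, careful orderings, stable decompositions), which is exactly why handing scientists a pre-vetted library beats letting everyone roll their own inner loop.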

I hope they're using programs that've had a few computer scientists' eyes over them.

Seeing as how the lead researcher [toronto.edu] holds M.Sc. and Ph.D. degrees in Computer Science, is cross-appointed to the Departments of Computer Science and Medical Biophysics at the University of Toronto, and is a Visiting Scientist with IBM's Center for Advanced Studies in Toronto...

...it seems likely that a computer scientist may have cast his eyes over the code once or twice.

Okay, not that I'm knocking how cool this grid computing is, but that estimate of 162 years without grid computing couldn't possibly be taking into account the acceleration of computing power. Maybe with today's computers it would take 162 years, but after the first couple of years, just get a new computer and cut the time in half.

Which reminds me of how towards the end of my grad school career I did hours long simulations that would have taken weeks at the beginning of grad school. I was in grad school a long time:(

The same could be said for life expectancy: right now the average North American life expectancy is around 70-something. I wouldn't be surprised if, by the time I'm in my late 60s, life expectancy has increased to 80-something or even 90.

Given the increase in obesity across the population, I expect average North American life expectancy to decrease. However, for the subgroup that can maintain a healthy diet and a good exercise balance, I think average life expectancy will go up to the range you are talking about. If you want to live longer with a good quality of life, eat a healthy balanced diet, make sure you don't let your body fat percentage get too high, and find a low-impact aerobic exercise that you enjoy and can continue to do as you

Cool, I was having the same thought recently. Depressed about how long it's taken me to finish school and really start life, it occurred to me that an average life span of 70 only applies to people born 70 years ago, when medical technology was crap compared to today. Which allows me to put off saving for retirement (or acting like an adult) without feeling bad.

We're computer scientists. We can calculate these kinds of things. Protein folding calculations take a ridiculous amount of time and processing power. That's a reflection of how complex your DNA is, not a reflection of how much processing power we have at our disposal. If we could borrow from the computing power of the future, then you might be right. But the fact remains, we only have what's at our disposal now. At the current state of computing technology, the calculations would take 162 years.

My comment wasn't some kind of proposal or solution and was in no way saying that this grid computing isn't a great thing. I was merely making the observation that it's dumb to consider only today's computing power and then come to the conclusion that a calculation will take 162 years, regardless of what the calculation is about. It will obviously take a much shorter time than that since the computers crunching the numbers will occasionally be upgraded.

But there isn't just raw computing power in play here. There are also the I/O requirements, memory requirements, etc. That's the beauty of grid computing: by distributing the load you can increase the throughput of the entire system, not just an individual component.

There is a theory for grad students doing computational simulations that they might as well do nothing for the first two years and then perform all calculations in the last few years, without losing time. Also, this is 162 years for a single core; in reality, problems like this will be run on a parallel machine.

That said, just as "cancer research" is a way to get easy funding, "grid computing" is not much more than that. The theory is very nice: work on a machine anywhere in the world from your own desktop without ha

Moore's law doesn't help us right NOW. If I promise you ten bazillion dollars in 2025, that doesn't help you buy even a stick of gum today.

Unless, of course, you'd like to stick to the realm of theoretics, in which case I postulate that cancer doesn't exist and neither do you, and by a solid application of Finagle's law I'm about to take a hatchet to my left hand. Do you see my point?

So, you seem to be complaining that the (evil) biopharmaceutical companies are greedy and want money and this is wrong... unless you can have a slice of it too? I think you need some sort of levee around your moral high ground, buddy.

But do we see a chunk of the profit that they'll be making off the cancer drugs they make from this data that OUR computers analyzed and then is eventually sold to us for too-high-to-afford prices?

The research is being done by scientists at Princess Margaret Hospital in Toronto, a government-run hospital. If you knew anything about health care in Ontario, you'd know that profit is the last thing on their mind.

You are one hundred percent correct.
We should NOT be contributing our precious *cough*unused*cough* CPU cycles to evil, money grubbing governmental institutions purely so they can further get better profits.
No cure for a disease which causes 13% of all deaths [who.int] is worth that, not unless I see some money for using my precious CPU cycles!

Every time these "connect desktops to become the fastest computer in the world" articles come up, I have to dust off my Cluster Urban Legends [clustermonkey.net] article to clear up the misconceptions that abound. I also did a piece [linux-mag.com] on the Linux Magazine site that debunks much of the spam-bot supercomputer legend (you need to register for that one).

The computers participating in the grid project are not just "desktop" computers. The ones connected from my alma mater were the ones that were maintaining thousands of X-Sessions across campus, on all the library machines and in all of the labs in dozens of buildings, supporting a student population of 40,000 students. Not the same as getting the spare cycles from someone's entertainment system or personal computer.

I'm not talking about spare cycles. I'm talking about the naive notion that gets repeated in the press: "the combined power of all these computers equals one of the fastest supercomputers in the world." For trivially parallel applications this might be true, but just once I would like to see one of these "supercomputers" run a standard parallel benchmark like High Performance Linpack (used for the Top500 list). My guess is the number of real FLOPS would be much less than expected -- if it even finished. Don't get me wrong, using computers like this is a great idea; it is not one of the most powerful computers in the world, however.

Seriously, if you can break up a task into small chunks and process it faster than some computer can, WTF difference does it make whether it fits your definition of some benchmark or other? Did the data get processed? (_) Yes (_) No. Who cares if YOU define a supercomputer in a certain anal way and decide it isn't fastest under XYZ criterion.

You and Tom from Tom's Hardware should get together and chew the fat about your benchmarks.

I'm not talking about spare cycles. I'm talking about the naive notion that gets repeated in the press "the combined power of all these computers equals one of the fastest supercomputers in the world"

If you know so much about the topic, why aren't you at Stanford telling Dr. Pande and his group that they are wasting their time with all those desktops and PS3's? I'm sure Dr. Pande would love for you to point out how his research would be much better off if he'd just go buy some time on a supercomputer.

I would like to see these "supercomputers" run a simple parallel benchmark

But the thing is, these clusters are not made for running benchmarks, but for real (and specialized) calculations. My home server processes data for the World Community Grid, and I see that the client silently numbercrunches for a few hours and then communicates for a few seconds (at the amazing speed of about 50 kB/s). And for this actual usage, the grid shows a performance that could only be replicated by a powerful supercomputer.

The parent doesn't even understand the kind of work being done on these grids. The work is broken up very methodically, in such a way that it can be worked on for a long time and communicated in a short time. In addition, the estimate of computing power is not 100% of all the computers connected; it's the percentage of the connected computers' power that is actually donated. These grid projects actually keep data to prove this.

Every time these "connect desktops to become the fastest computer in the world" articles come up, I have to dust off my Cluster Urban Legends article to clear up the mis-conceptions that abound. I also did a piece on the Linux Magazine site as well that debunks much of the spam-bot supercomputer legend (need to register for that one)

Too bad you're wrong in this case, since protein folding is embarrassingly parallel. How do you think Folding@home works?
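For readers unfamiliar with the term: "embarrassingly parallel" just means the work units are completely independent, so a simple scatter/gather scheme works and the results can come back in any order. A toy sketch (`score_conformation` is a made-up stand-in for an expensive folding/scoring kernel, not anything from Folding@home):

```python
# Embarrassingly parallel work: hand out independent work units, collect
# results in any order, no communication between workers.
from multiprocessing import Pool

def score_conformation(seed):
    # Placeholder for an expensive, independent computation: here just a
    # deterministic linear congruential generator churned 1000 times.
    x = seed
    for _ in range(1000):
        x = (x * 1103515245 + 12345) % (2 ** 31)
    return seed, x

if __name__ == "__main__":
    with Pool(4) as pool:
        results = dict(pool.map(score_conformation, range(100)))
    print(len(results))  # 100 independent results, computed in parallel
```

This is exactly the shape of a grid workload: long stretches of pure computation, a tiny result to report at the end, and no worker ever waiting on another.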

I'm very glad to help cancer research, but will this also result in the development of drug patents that (a) bankrupt some patients, and (b) prevent other researchers from improving on those drugs?

I agree with you, in principle (that it's just not fair for you to gain nothing), but isn't donating your CPU time still the best solution? I mean, it's not as though there's some choice you could make that would likely lead to a better outcome for you.

I'm very glad to help cancer research, but will this also result in the development of drug patents that (a) bankrupt some patients

The alternative is "don't help the distributed computing project" and those drugs will never be 'discovered'. Then, instead of being poor and alive, the patients will be wealthy and at room temperature.

I'm very glad to help cancer research, but will this also result in the development of drug patents that (a) bankrupt some patients, and (b) prevent other researchers from improving on those drugs?

If it makes you feel better, the bioinformatics team is being led by a Canadian researcher out of a Canadian institution (the Ontario Cancer Institute at Princess Margaret Hospital, jointly with the University of Toronto). In Canada, chemotherapy drugs are provided to patients free of charge, and pricing is cont

"The researchers estimate that this analysis would take conventional computer systems 162 years to complete."They're always saying, "We've knocked decades off of our work by using the right tool for the job." That's like me saying I knocked decades off of the calculations to run an energy minimization on a hexane molecule by running it on my Core 2 Duo instead of my Atari 800.

I mean, let's face it: they weren't going to let the friggin' program run for 162 years. The problem became solvable when the hardware became available. Hell, within 5 years, that "conventional computer system" will be able to solve it in a fraction of that 162 years, and 5 years later, a fraction of that. So what do you do? You wait until the hardware meets up with ability to solve the problem. They haven't saved decades. They probably haven't even saved a decade. Within a decade they'd probably be able to run it in a few days on a conventional computer.

You wait until the hardware meets up with ability to solve the problem.

So, if I'm following you correctly, you want the medical researchers to stockpile all the research projects that have "heavy computing demands" until Intel comes out with their 128-core CPU? What do we do in the meantime? Just sit around saying, "Oh jeez, sorry we don't have a treatment for your leukemia. But in ten years, we are going to launch a computer program that will have an answer for us after running for just thirty days!"?

So, if I'm following you correctly, you want the medical researchers to stockpile all the research projects that have "heavy computing demands" until Intel comes out with their 128-core CPU?

No, you're not following me correctly. My point is, nobody is going to run a program that's going to take decades to run. Instead, they're going to run some scaled-down version that approximates a solution, or they're going to find some other method to solve the problem. When the computing power is available to run it in a

Yes, or they could do it RIGHT NOW and save 17 years. (Actually, the sweet spot is 12 years away, since it would then take 2.5 years to run for a total of 14.5 years, while waiting 14 years would still take 1.25 years for a total of 15.25 years. So they'd save 13.5 years if they could run it in 1 year on today's computers.) While that's not -decades-, it IS over a decade. Do you know how many people die of cancer in a decade? http://www.medicalnewstoday.com/articles/37480.php [medicalnewstoday.com] Apparently about 550,000 people die
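That sweet-spot arithmetic checks out if you assume computing power doubles every 2 years (the doubling period is an assumption; pick a different one and the optimum moves):

```python
# Wait-vs-run arithmetic: a job takes 162 years on today's hardware, and
# hardware doubles in speed every 2 years. If you wait w years before
# starting, total elapsed time is w + 162 / 2**(w/2).

JOB_YEARS_TODAY = 162.0
DOUBLING_PERIOD = 2.0  # years (assumed)

def total_time(wait_years):
    speedup = 2 ** (wait_years / DOUBLING_PERIOD)
    return wait_years + JOB_YEARS_TODAY / speedup

best = min(range(40), key=total_time)
print(best, round(total_time(best), 2))  # 12 14.53
```

So waiting 12 years and then running for ~2.5 is the best you can do under these assumptions, which is exactly why actually starting now on a grid (the 2-year plan in the article) beats waiting for Moore's law.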

World Community Grid [worldcommunitygrid.org] is making [this] technology available only to public and not-for-profit organizations to use in humanitarian research that might otherwise not be completed due to the high cost of the computer infrastructure required in the absence of a public grid. As part of our commitment to advancing human welfare, all results will be in the public domain and made public to the global research community.

WCG uses the Berkeley Open Infrastructure for Network Computing (BOINC) client, an open source software project that runs on Linux, Mac and Windows. The headline should read "Open Source Software Cures Cancer" ;-)

BoincStats [boincstats.com] shows you who is contributing to World Community Grid projects. Check it out...and ask yourself why you aren't contributing.

We "the people" run the software and pay the millions of dollars of hardware and electricity costs. When the problem is solved the University patents everything (thank you suckers) and licenses the technology for for a small fortune to some back stabbing Megacorp (TM) drug company. So when "we the people" get sick we have the wonderful knowledge that we have paid twice for the ripp-off drugs. So all things being fair, if you want my cpu spare time I want a part of the license fees to pay for the drugs that cost a house when I get sick.

I know this research, and the people involved in it very, very well, and I think this project is a very sad, very large waste of computing time.

Let me back up and explain what the project is doing. To simplify a little bit, the vast majority of "work" in the cell is done by proteins. While DNA can be thought of as something like a simple "string", proteins have complex three-dimensional shapes. Knowing those 3D shapes is of great interest to biologists. There are several reasons for that. One is that it can allow easier design of drugs targeted at a specific part of the protein. Another is that by seeing the shape, we can understand how all the mutations that occur in disease might be affecting its function.

The primary way to determine the shape of a protein is to grow it into an ordered crystal. You can then shine an X-ray beam through the crystal, and the diffraction pattern that emerges can be, through some very complex math, reverse-engineered into a 3D structure. Typically the most difficult part of this process is finding the specific chemical conditions that will allow a crystal to grow. These conditions differ from protein to protein.

This project is not "solving cancer" by any means. Rather, the people in Buffalo have developed a high-throughput way of screening different chemical conditions to determine which ones might allow a protein crystal to grow. They use robotics to screen about 1000 conditions and take pictures of each one. The question then becomes: can you automatically process the pictures to find crystals? That's the goal of this project: to help automatically identify crystals in this screen.

So why do I object so strongly to this work? There are three reasons.

First, the project has nothing to do with cancer. In fact, the proteins being analyzed are not in any way "cancer-specific proteins" -- many of them are not even human!! This "cancer" pitch is a sales job, and nothing but a sales job. As a cancer researcher, it offends me that people try to use the disease to justify research that is this unrelated.

Second, the project is ill-conceived, technically. In no way did the group in question (Igor Jurisica's lab, in Toronto) carefully select a machine-learning approach to identify good ways of analyzing images. Instead, they have just selected something like 1000 different techniques, and are running *all* of them on every image they have. It's a fishing expedition, with the hope that one of those thousand metrics they return will be a useful predictor.

Third, the techniques selected are basically arbitrary. Most egregiously, there appear to be NO Fourier transforms included in the analysis!! Further, the images generated by the software appear to be transforms of something called "gray level cooccurrence matrices", and the computation of those can be estimated in no more than five minutes. So why are they taking 5 hours per unit? It appears that they have chosen to implement an exhaustive GLCM search that is an order of magnitude slower, rather than using existing estimation procedures that are ~98.5% accurate. Is that an excuse to use more computer time? Is there any scientific merit to that? Why aren't Fouriers included, since they are a standard technique for image analysis?
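For readers who haven't met GLCMs: a gray-level co-occurrence matrix is just a 2D histogram of how often intensity i sits next to intensity j at a fixed offset; texture features like contrast and homogeneity are then derived from it. A minimal sketch for a single offset, one pixel to the right (this illustrates the concept only; it is not the project's actual code, which sweeps many offsets and angles):

```python
# Gray-level co-occurrence matrix (GLCM) for horizontally adjacent pixels.
def glcm(image, levels):
    """Count how often gray level a is immediately left of gray level b."""
    m = [[0] * levels for _ in range(levels)]
    for row in image:
        for a, b in zip(row, row[1:]):
            m[a][b] += 1
    return m

img = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 1],
]
print(glcm(img, 3))  # [[1, 2, 0], [0, 1, 1], [0, 0, 1]]
```

Even from this toy version you can see why a single GLCM is cheap (one pass over the image per offset); the cost the parent is complaining about comes from exhaustively sweeping a very large space of offsets, window sizes, and derived statistics.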

I have a number of computers that I run various BOINC projects on, but this will NEVER be one. It's a fishing expedition, being sold as cancer research, and that is a sad way to deceive the public.

Given that most proteins contain tryptophan, tryptophan fluoresces under UV, and UV lasers are not that hard to come by, wouldn't it be easier to shine a UV laser at the crystallisation plate and detect by subtraction where the glowy bit is? Or, as a lot of molbio automation companies are offering, actually shine an X-ray beam through the putative crystal onto a detector and see if it diffracts.

I'll start off by saying that I know little more about X-ray crystallography than what you explained in your post. My concern, however, is with your criticism. I understand your distaste for the project's underhanded tactics in trying to generate publicity. Beyond that, however, your criticisms fail to address the merits of what the group IS doing (other than what I perceive as your criticism of high-throughput screening in general). If you feel that your technical critici

In no way did the group in question (Igor Jurisica's lab, in Toronto) carefully select a machine-learning approach to identify good ways of analyzing images. Instead, they have just selected something like 1000 different techniques, and are running *all* of them on every image they have. It's a fishing expedition, with the hope that one of those thousand metrics they return will be a useful predictor.

Not quite. The machine learning bit comes second. You have to spend the CPU cycles to extract features from the images first. Only then can your favourite ML technique tell you if the features are predictive. The first ~1000 features (already computed, locally) show some promise, and that's why this project will explore the image feature space a bit more (~12000 features). Once we get Grid results back from our human-scored image set, any features that are a clear waste of time will be dropped.

Third, the techniques selected are basically arbitrary. Most egregiously, there appear to be NO Fourier transforms included in the analysis!!

Here is a more complete story: between changing compilers, moving from the development platform to the target platforms, and identifying some redundant computation in one corner of the algorithm, we were able to reduce the run-time from about six hours to five minutes. This allowed us to undo some rather brutal compromises (accuracy for speed) we had made in a previous stage of development, when we thought the analysis was running unacceptably long for Grid purposes. The extra hours are not busy work.

A question for Slashdotters: I am wondering... would you agree to run a distributed app if you didn't know what it did (let's say the developers want the purpose of the app to remain secret), but there was some kind of competition with money prizes for, say, the top 100 CPU-time contributors? Such as $5000 for the 1st, $1000 for the next 4, and $500 for the next 95.

(Of course I assume some would be tempted to reverse engineer the distributed app, because of pure curiosity).

I noticed that you omitted using linebreaks in an effort to save energy.

Seriously, I asked you to provide a source that links distributed computing to global warming and you threw a bunch of numbers together. Please provide a credible source which documents the link between PCs and global warming. Thanks.