1. Summary

We discussed plans for our new, more robust hosting set-up.

Unfortunately, board member NathanKennedy didn't show up, so we couldn't declare it an official board meeting.

2. Decisions

We settled on DavorOcelic, JustinLeitgeb, and MichaelOlson as the new team of system administrators. AdamChlipala will transition to the role of author and maintainer of custom administration software only, with no day-to-day handling of support requests. The new team will lead the planning of our new hosting infrastructure, but we'll keep our current admin team until that's ready.

We chose Hurricane Electric as our next colo provider, pending any new information that might come up. You can find the quotes given to us by them and other providers on ColocationProvidersEvaluation.

We decided that our new hosting plans require purchasing the following hardware: one server that all members are allowed to log into, one server accessible only to admins, a network-accessible serial console, and a switch. More detail on Hardware.

3. Action Items

We need to figure out what models/configurations of hardware to buy and where to buy it! Everyone is encouraged to research the options and report them on OurHistory/HardwareAppraisal.

We've created a new mailing list, hcoop-sysadmin. Its purpose is to host quite technical discussions on how we should configure our servers. It's open to everyone, not just official sysadmins, and the intention is that we continue to use this list throughout HCoop's lifetime, keeping hcoop-discuss more for social or less technical discussion. Everyone with an interest in this stuff is encouraged to join by subscribing via the Portal Preferences page!

We've scheduled our next meeting (hopefully an official board meeting!) for exactly a week after this one: Saturday, July 1, 2006 at 18:00:00 UTC. Hopefully research on hardware options will be complete by then, and we can make some definite choices about what to buy.

4. IRC Log

Jun 24 11:20:47 <Smerdyakov> OK, the appointed time is here.
Jun 24 11:21:01 <Smerdyakov> How about we start with an official announcement of presence from everyone who's here and watching?
Jun 24 11:21:03 <Smerdyakov> Hi!
Jun 24 11:21:15 <iriefrank> no its not healthy
Jun 24 11:21:35 <Optikal_> I am here.
Jun 24 11:21:41 <iriefrank> Frank Bynum here
Jun 24 11:21:46 <leitgebj> Hi all.
Jun 24 11:21:47 <jch5> this is John Hallgren from Cape Cod MA
Jun 24 11:21:48 <Smerdyakov> And deactivate off-topic conversation for the duration. :)
Jun 24 11:22:20 * Smerdyakov pings docelic.
Jun 24 11:22:30 <bpt> i am here
Jun 24 11:22:51 >docelic_< Ready to start?
Jun 24 11:22:59 <docelic_> Yep we can start
Jun 24 11:23:20 <Smerdyakov> OK. I think the first thing is to decide officially who is in charge of being in charge of things. ;)
Jun 24 11:23:21 <docelic_> (Davor Ocelic here)
Jun 24 11:23:44 <bpt> <- Brian Templeton
Jun 24 11:23:44 <Smerdyakov> ntk isn't here, so this doesn't count as a board meeting per our bylaws, but docelic and I can still make official decisions by majority vote.
Jun 24 11:24:12 <Smerdyakov> We have our admin volunteers listed here: http://wiki.hcoop.net/wiki/AdminVolunteers
Jun 24 11:24:44 <Smerdyakov> ntk volunteered himself half-heartedly in case we didn't find enough folks. What do we think? Is 3 sysadmins a good number, or do we want 4?
Jun 24 11:25:52 <jch5> from a lowly user POV, if 4 is possible, I'd appreciate it...to cover all time slots and backup/vacation
Jun 24 11:25:53 <docelic_> Do we want to make a complete decision now, or a decision of "3 now, with one slot open" is good enough?
Jun 24 11:25:54 <Smerdyakov> We're making do now with, effectively, 2, where one (me) doesn't really want to be doing day-to-day sysadmin stuff.
Jun 24 11:26:25 <leitgebj> I think that if it was a half-hearted self-nomination, we are OK with 3 until someone comes along who is interested. ntk is doing a lot with the co-op without taking on more responsibilities, unless he really wants them. I am also confident that someone will come along in short time who is willing to do the job.
Jun 24 11:26:29 <Smerdyakov> jch5, the counterbalancing issue is that more people with root access degrades security.
Jun 24 11:26:53 <iriefrank> three is fine with me, i dont think we need an official declaration of how many sysadmin "slots" there are either
Jun 24 11:27:06 <docelic_> I'm a little surprised we couldn't come up with 4. There's been a number of people who expressed their interest over time :(
Jun 24 11:27:07 <Smerdyakov> Incidentally, I'm also suggesting that I retain root access as the person in charge of software infrastructure... AKA maintaining domtool and related stuff.
Jun 24 11:27:25 <Smerdyakov> So that would be 4 with docelic, leitgebj, mwolson, and me.
Jun 24 11:27:46 <leitgebj> Smerdyakov, I think that sounds like a solid plan.
Jun 24 11:27:54 <docelic_> Sounds good.
Jun 24 11:27:55 <Smerdyakov> iriefrank, there are administrative things like whom to make sure is present at meetings about sysadmin stuff. :)
Jun 24 11:27:58 <jch5> 10-4
Jun 24 11:28:34 <Smerdyakov> OK, so it sounds like for now we'll assume the roster of sysadminesque people I just stated.
Jun 24 11:29:19 <Smerdyakov> Now I think we should turn to choosing our next colo provider: http://wiki.hcoop.net/wiki/ColocationProvidersEvaluation
Jun 24 11:29:41 <Smerdyakov> HE seems to have the most attractive deal at present. Any objections to taking them as our intended provider going forward?
Jun 24 11:30:09 <leitgebj> I also think that he.net is the best option.
Jun 24 11:30:30 <leitgebj> docelic, how do you feel?
Jun 24 11:31:45 <docelic_> I wasn't much into data collection for this round
Jun 24 11:31:52 <docelic_> I've read all there was on the subject earlier today
Jun 24 11:31:59 <Smerdyakov> docelic, you can see on their page that they have the lowest price. ;)
Jun 24 11:32:17 <docelic_> and I heard of HE.net before. We considered them in the previous round too, or I heard of them from some other place?
Jun 24 11:32:52 <Smerdyakov> I don't know.
Jun 24 11:33:48 <Smerdyakov> OK... I don't see a lot of deliberation. docelic, are you pondering this now, or are you just yielding to our conclusion? :)
Jun 24 11:34:16 <docelic_> I was just doing inquiries, but I agree with going on with HE
Jun 24 11:34:33 <docelic_> My point was that, if it wasn't in our previous round, then I heard good things about HE from other places as well
Jun 24 11:35:08 <Smerdyakov> OK. That leaves us to figure out exactly what we want to host there.
Jun 24 11:35:36 <Smerdyakov> Whose natural home is http://wiki.hcoop.net/wiki/SystemArchitecturePlans
Jun 24 11:35:52 <docelic_> Are we looking to move abu and fyodor there ?
Jun 24 11:36:02 * voider (n=Is@modemcable169.254-80-70.mc.videotron.ca) has joined #hcoop
Jun 24 11:36:16 <Smerdyakov> docelic, we don't own fyodor, so I think that's a "no" there.
Jun 24 11:36:26 <Smerdyakov> docelic, Abu is something we haven't decided on yet.
Jun 24 11:36:29 <docelic_> Well I mean in a software sense
Jun 24 11:36:35 <voider> hi guys
Jun 24 11:36:36 <Smerdyakov> docelic, eh?
Jun 24 11:36:40 <docelic_> hello voider
Jun 24 11:36:40 <voider> i am late?
Jun 24 11:36:56 <voider> *am i*
Jun 24 11:37:06 <docelic_> voider: not much, we agreed on admin structure (leitgebj, mwolson, adam and me), and to go with HE.net for next provider
Jun 24 11:37:06 <Smerdyakov> voider, not so late. We decided on who the sysadmins would be, decided to go with HE as our next colo provider, and now we discuss system architecture to colocate.
Jun 24 11:37:26 <docelic_> Smerdyakov: I didn't mean "moving" in sense of relocating hardware, but services
Jun 24 11:37:29 <voider> ok good
Jun 24 11:37:49 <metaperl> what about Mike_L as admin?
Jun 24 11:37:50 <Smerdyakov> docelic, it has been my assumption the whole time that all of our primary services would move to the new servers.
Jun 24 11:37:58 <Smerdyakov> metaperl, he didn't add himself to the candidate list.
Jun 24 11:38:09 <voider> i'll mostly act as a spectator, I don't know enough about the way you operate hcoop yet
Jun 24 11:38:15 <Smerdyakov> metaperl, and I think he doesn't have much interest in the level of time commitment that is appropriate.
Jun 24 11:38:15 <docelic_> metaperl: just like Clinton didn't
Jun 24 11:38:27 <docelic_> Smerdyakov: ok then
Jun 24 11:39:05 <Smerdyakov> The SAP wiki page lists three servers: file server, member accessible server (the scary zone!), locked-down server (where we keep as much as we can).
Jun 24 11:39:35 <Smerdyakov> I think of all of these as limited to "Internet hosting" stuff only.
Jun 24 11:39:54 <Smerdyakov> We could also consider (space permitting) to move Abu to the new colo place and make that a generic "shell server" for whatever.
Jun 24 11:40:06 <docelic_> ok
Jun 24 11:40:07 <Smerdyakov> We could even buy a new server for that, but this seems a good use for something we have around anyway.
Jun 24 11:40:31 <Smerdyakov> But I consider that something to do after the main stuff is up and running smoothly.
Jun 24 11:41:14 <Smerdyakov> Does anyone object to the very high-level proposed configuration for the main 3 servers?
Jun 24 11:41:28 <Smerdyakov> Or have any suggestions at that level before we drill down..
Jun 24 11:42:32 <docelic_> I think it's OK. One thing that wasn't mentioned, and it should have been as it's the same logical level as AFS fileserver, is centralized login DB
Jun 24 11:43:03 * mwolson (i=mwolson@jpi-wlafyte-213-192.dmisinetworks.net) has joined #hcoop
Jun 24 11:43:06 <docelic_> Very probably using LDAP
Jun 24 11:43:15 <docelic_> mwolson: short summary:
Jun 24 11:43:19 <Smerdyakov> Yes. I've always associated that with AFS in my CMU days. I think it was based on Kerberos; that is, I know Kerberos was used, but I'm not sure if there was something else more fundamental.
Jun 24 11:44:01 <mwolson> oh ... darnit
Jun 24 11:44:04 <docelic_> mwolson: admins: leitgebj, mwolson, adam, docelic. Next provider: HE.net. System structure OK as proposed on http://wiki.hcoop.net/wiki/SystemArchitecturePlans
Jun 24 11:44:09 <leitgebj> Right, I was thinking of Kerberos as well, even though I haven't used it before. I have used ldap a bit.
Jun 24 11:44:42 <docelic_> Well, AFS has something like kerberos built-in for authentication, but you can plug it into existing kerberos setup if you have one
Jun 24 11:44:54 <Smerdyakov> I know Kerberos and AFS also work well with geographically distributed networks... so it could scale well to future set-ups where we branch out into multiple main locations.
Jun 24 11:45:15 <docelic_> So AFS would imply using kerberos, and somehow by nature of things, kerberos would imply LDAP
Jun 24 11:45:29 <Smerdyakov> Maybe. I never used LDAP in my experiences in college.
Jun 24 11:45:45 <Smerdyakov> That is, I never did something where it was obvious that the LDAP protocol was involved.
Jun 24 11:45:55 <Smerdyakov> Anything could have happened behind the scenes.
Jun 24 11:46:21 <docelic_> I've read the O'reilly book on LDAP and played with it last month, in a more serious way than before
Jun 24 11:46:42 <mwolson> i've set up LDAP before at my LUG
Jun 24 11:47:54 <mwolson> i'd be willing to do so here, if we decide on it
Jun 24 11:48:19 <Smerdyakov> I think we should pick a main person in charge of the purchase and set-up of the new systems.
Jun 24 11:48:36 <Smerdyakov> I anti-nominate me because I do a lot of stuff already, AND I don't know much about this stuff. ;)
Jun 24 11:48:53 <docelic_> I volunteer for the software setup, as usual
Jun 24 11:49:16 <Smerdyakov> What about hardware elements?
Jun 24 11:50:02 <Smerdyakov> *a pin drops*
Jun 24 11:50:08 <leitgebj> I am in the process of buying a house right now, and that may complicate things.
Jun 24 11:50:48 <Smerdyakov> Well, I know I'll screw something up if _I'm_ in charge of it!
Jun 24 11:50:48 <leitgebj> I am willing to help, but I imagine that what will happen is that we will have the servers shipped to a physical location, plugged into a network, e.g., dsl connection. then they are configured and sent to the colo provider.
Jun 24 11:51:14 <Smerdyakov> leitgebj, there are serious issues of just _which_hardware_ to buy, and I think you expressed opinions there in the past.
Jun 24 11:51:17 * Optikal___ (n=optikal@pool-70-18-225-250.res.east.verizon.net) has joined #hcoop
Jun 24 11:51:37 <leitgebj> Right, that's another stage of the planning that seems to have not been settled fully.
Jun 24 11:51:44 <leitgebj> Maybe we should start there.
Jun 24 11:51:55 <leitgebj> How much money do we have, or have access to?
Jun 24 11:51:58 <Smerdyakov> That's what I was talking about: appointing someone in charge of that stage.
Jun 24 11:52:22 <leitgebj> I volunteer to lead that part of the process.
Jun 24 11:52:51 <Smerdyakov> Here we can see how much we theoretically will commit to spending per month: https://members.hcoop.net/portal/poll?id=8
Jun 24 11:52:57 <jch5> i'd just ask to not overbuy and spend extra $$$ unneeded
Jun 24 11:53:18 <voider> I have a question, dunno if it has been discussed before: how will all this cost? How will it affect the monthly payment?
Jun 24 11:53:31 <voider> ah
Jun 24 11:53:32 <docelic_> voider: look at the URL Smerdyakov pasted
Jun 24 11:53:36 <Smerdyakov> I get $317 dollars considering the lower bounds of people who voted.
Jun 24 11:53:58 <Smerdyakov> voider, we will use sliding scale payments to make sure there is no serious increase for people who don't feel HCoop is worth investing more.
Jun 24 11:54:21 <voider> shit, I don't have my password here
Jun 24 11:54:29 <voider> and I forgot it :/
Jun 24 11:54:38 <voider> I'll check that link when I return back to home
Jun 24 11:55:01 <jch5> in my case, it's not that it's not worth it...just that I have a very limited budget for my site
Jun 24 11:55:11 <Smerdyakov> I was somewhat surprised by the scarcity of high-end answers to that poll. :\
Jun 24 11:55:24 <Smerdyakov> Lots of people not willing to spend as much as they do for cable TV. :P
Jun 24 11:55:56 <docelic_> We need to figure out new ways to increase hcoop service value
Jun 24 11:56:16 <voider> personally I'm not in a situation that allows me to spend much for my hosting
Jun 24 11:56:23 <Smerdyakov> leitgebj, there are issues to consider of "do we want to take out loans?" and "do we want to have some loan-like structure arranged amongst ourselves?" and "do we expect a number of members to make one-time donations for hardware that will never be paid back?".
Jun 24 11:56:29 <mwolson> bah, who would pay money to see mostly commercials ...
Jun 24 11:56:39 <jch5> I've got $6 a month for all my site expenses...
Jun 24 11:56:49 <Optikal___> $50 a month is outrageous for web hosting, especially non-commercial
Jun 24 11:57:01 <Smerdyakov> jch5, that's what's always confused me about your situation. What's so bad about paying out of your own pocket?
Jun 24 11:57:06 <mwolson> (referring to cable TV)
Jun 24 11:57:14 <Smerdyakov> jch5, I pay out of my own pocket for a number of groups whose web sites I host.
Jun 24 11:57:36 <Smerdyakov> jch5, but that's distracting now and perhaps interesting to bring up later.. :)
Jun 24 11:57:54 <Smerdyakov> Optikal_, $50 a month is a negligible investment for the average young IT hotshot. :P
Jun 24 11:58:05 <Optikal___> I don't think of money in relative terms
Jun 24 11:58:13 <jch5> it's designed to be self supporting by users.. i provide the brains...well most
Jun 24 11:58:51 <Smerdyakov> leitgebj, I hope that answered your question. ;)
Jun 24 11:59:28 <voider> someone can tell the big lines of the above pasted link please?
Jun 24 11:59:29 <leitgebj> OK, that is fine. There is something that I've been thinking of in relation to the system architecture, though. It seems that we are just over the point where one system suffices. Can we consider buying two solid servers to host at he.net? That should be more than sufficient for quite a while, if we are smart about building them.
Jun 24 11:59:30 <Optikal___> and this isn't exactly an investment, all the returns go to other hcoop members.
Jun 24 11:59:38 <voider> you're talking about 50/month ?
Jun 24 11:59:55 * voider scratches his scrotum
Jun 24 12:00:04 <Smerdyakov> leitgebj, that might be sensible. Would you think of combining "file server" and "locked-down server"?
Jun 24 12:00:31 <leitgebj> Perhaps. I guess I didn't get that far yet! ;)
Jun 24 12:00:34 <Smerdyakov> voider, the most popular answer was that people are willing to pay $6-$10/mo. in the short term while we build membership and pay off loans.
Jun 24 12:00:53 <Smerdyakov> leitgebj, because there are many nice consequences of having a server where only admins can log in.
Jun 24 12:01:19 <voider> ok
Jun 24 12:01:19 <leitgebj> Smerdyakov, certainly, I agree that we should have one machine with only admin logins.
Jun 24 12:01:50 <mwolson> that machine had better be doing other things as well though :^)
Jun 24 12:02:04 <Smerdyakov> leitgebj, well... it seems only natural that the file server should be on that machine instead of the one where users go wild running their programs. :)
Jun 24 12:02:16 <leitgebj> Smerdyakov, I agree.
Jun 24 12:02:39 <Smerdyakov> As long as we keep a clean upgrade path to dedicated fileserver mode..
Jun 24 12:03:15 <docelic_> Technically we could group together 1) and 3). (fileserver, and other services used through protocols and not shell)
Jun 24 12:03:23 <Smerdyakov> Anyone else's thoughts on leitgebj's idea?
Jun 24 12:03:48 <docelic_> It would reduce bandwidth too. Like, no need for mail server to lay data on a fileserver that's remote
Jun 24 12:03:56 <Smerdyakov> leitgebj, also, I wouldn't really say that we've outgrown a single machine... rather that a certain level of redundancy is important for reliability.
Jun 24 12:04:08 <Smerdyakov> docelic, but it would be local bandwidth only. *shrug*
Jun 24 12:04:40 <Optikal___> Are most the reliability problems at the machine level rather than the network level?
Jun 24 12:04:45 * voider has quit (Read error: 104 (Connection reset by peer))
Jun 24 12:04:57 <docelic_> Smerdyakov: yeah.. still. Well the price would be most visible factor in this condensed setup. And admining would be a little easier as well
Jun 24 12:05:10 <docelic_> Why have 2 machines with admin-logins only when there can be one
Jun 24 12:05:24 <docelic_> I kind of like the 2-in-1 idea actually, in this phase at least
Jun 24 12:05:25 <mwolson> docelic_: yeah, it would definitely be best to have fileserver and mailserver on the same machine; i know NFS at least does not like writing to many small files
Jun 24 12:05:35 <Smerdyakov> docelic, compute-bound and disk-bound servers have very different criteria for good hardware specs... that's one reason.
Jun 24 12:06:27 <docelic_> Smerdyakov: right. However, I estimate buying one is still better, as you can save on prt of specs that would be underused if there was 2 separate systens
Jun 24 12:06:34 <Smerdyakov> But I agree that price concerns trump that sort of thing today, as long as we can meet a certain performance threshold to be determined.
Jun 24 12:06:39 <docelic_> part* and systems*
Jun 24 12:07:19 * voider (n=Is@modemcable169.254-80-70.mc.videotron.ca) has joined #hcoop
Jun 24 12:07:20 <docelic_> With modern AMD or Intel processors, we can do anything
Jun 24 12:07:44 <Smerdyakov> docelic, for instance, file server might be more reliable with a complicated RAID mode that slows down ordinary operation.
Jun 24 12:07:44 * Optikal_ has quit (Read error: 110 (Connection timed out))
Jun 24 12:08:03 * TheDebugger (i=TheDebug@modemcable135.111-81-70.mc.videotron.ca) has joined #hcoop
Jun 24 12:08:12 * TheDebugger (i=TheDebug@modemcable135.111-81-70.mc.videotron.ca) has left #hcoop
Jun 24 12:08:55 <voider> oops, lost connection
Jun 24 12:09:10 <Smerdyakov> OK... so have we magically shifted to agreeing on a two-server main set-up?
Jun 24 12:09:20 <docelic_> Smerdyakov: you are right. But it comes down to estimating what's good enough, and as far as Im concerned, 1 system setup could work in this phase.
Jun 24 12:09:33 <docelic_> With a physical room to grow at HE, we can always split later
Jun 24 12:10:01 <Smerdyakov> It would be nice to have an automated system for switching static web sites between servers in an emergency.
Jun 24 12:10:27 <docelic_> Between which servers ?
Jun 24 12:10:30 <leitgebj> Smerdyakov, I was thinking about that, too... actually would be pretty easy to make it work with dynamic sites as well, in the future.
Jun 24 12:10:41 <Smerdyakov> The big pain I've encountered with fyodor is ulimits and related stuff.
Jun 24 12:10:58 <Smerdyakov> They tend to interfere with system services, at least when a novice like me sets things up
Jun 24 12:11:06 <Smerdyakov> Which is why I think two servers is worth the effort.
Jun 24 12:11:15 <docelic_> Smerdyakov: the ulimits thing is working pretty well now that it settled down , doesn't it ?
Jun 24 12:11:39 <Smerdyakov> docelic, maybe you figured out the right way to restart services so they aren't killed for forking, but I'm still not sure. :)
Jun 24 12:13:20 <leitgebj> I think that there are lots of reasons to have two systems. In an extreme case (hardware on one machine dies), we could also recover in less time by switching services to the other box. Currently, we would have to buy a machine, configure it, and send it out. That is a lot of time that the coop would be down for.
Jun 24 12:14:27 <docelic_> Smerdyakov: there was a discussion in Debian about it, I made a mental note to read the conclusion, but IIRC I think this was some system level problem with limit inheritance that was identified and planned to be fixed
Jun 24 12:14:53 <Smerdyakov> Well... are we agreed on two servers, or are one or three still in consideration?
Jun 24 12:15:38 <docelic_> I suppose the "two servers" you and leitgebj mentioned just now, were actually meant as "two admin-only servers", as in fileserver and other server separate ?
Jun 24 12:15:45 <Smerdyakov> No.
Jun 24 12:15:49 <Smerdyakov> Not when I said it, at least
Jun 24 12:15:51 <jch5> and that is one reason I was wondering why Abu can't be used as emergency backup for current sys...just curious..
Jun 24 12:16:03 <docelic_> "<leitgebj> I think that there are lots of reasons to have two systems. "
Jun 24 12:16:07 <Smerdyakov> jch5, we haven't taken the time to set it up.
Jun 24 12:16:25 <leitgebj> I meant two systems, total. One for admin, and one for user access. Nothing else, to start.
Jun 24 12:16:28 <Smerdyakov> jch5, and Abu has some crappy set-up aspects to it... plus unaffordable bandwidth rates at its location.
Jun 24 12:17:29 <Smerdyakov> Using fyodor as an automatic back-up would be feasible once the new stuff is going.
Jun 24 12:17:55 <Smerdyakov> Since we have enough "free bandwidth" to transfer a complete dump of users' web site content in an emergency without going broke.
Jun 24 12:18:02 <jch5> Smerdyakov: that idea works for me..
Jun 24 12:18:23 <docelic_> leitgebj: ah, that's assumed. We were thinking about 2 servers vs. 3 servers choice
Jun 24 12:20:15 <Smerdyakov> _OK_, so it seems that we agree on 2 main servers total.
Jun 24 12:20:48 <docelic_> ok
Jun 24 12:20:51 <jch5> is there any situation where we might need a server just for time of conversion but not ongoing?
Jun 24 12:21:01 <docelic_> no
Jun 24 12:21:18 <leitgebj> I don't think so... that is what fyodor and abu are for! ;)
Jun 24 12:21:47 <jch5> i meant in terms of what is located at HE
Jun 24 12:22:40 <jch5> remember, i'm just an slightly literate user who used to be a mainframe pgmr
Jun 24 12:23:34 <Smerdyakov> jch5, I think the answer is no, but I don't quite understand the question... let's move on, OK? :)
Jun 24 12:25:06 <Smerdyakov> I am officially declaring that we have decided on 2 main servers: one for things that require member log-in and one for things that don't.
Jun 24 12:25:08 <jch5> fine...(once my employer had to rent a second CPU just for period of conversion from old to new)
Jun 24 12:25:30 <Smerdyakov> That isn't meant as an undeniable edict, but just to get things going... make an explicit objection if you want to question the assumption.
Jun 24 12:25:46 <Smerdyakov> Now, what hardware should we buy and where?
Jun 24 12:26:07 <Smerdyakov> unknown_lamer questioned the value of a hardware firewall on the wiki. I'm inclined to agree that we can at least avoid one to start.
Jun 24 12:26:32 <leitgebj> I agree.
Jun 24 12:26:34 <Smerdyakov> So what does that leave us to buy that wouldn't be contained inside one of the servers?
Jun 24 12:26:56 <Smerdyakov> Serial console seems like a potential life-saver and worth any reasonable one-time cost.
Jun 24 12:27:15 <docelic_> You mean a serial console we can connect to remotely ?
Jun 24 12:27:19 <Smerdyakov> Yes.
Jun 24 12:27:27 <docelic_> That's a winner, for sure
Jun 24 12:27:29 <leitgebj> With only two machines, there would also be no reason to a switch. Most modern servers come with two nics, and we can set up a network between them for the LAN.
Jun 24 12:27:48 <Smerdyakov> leitgebj, what if we move Abu in as a generic shell server?
Jun 24 12:28:00 <leitgebj> Can one of you point me to a manufacturer of a serial console? I've never used one.
Jun 24 12:28:14 <docelic_> leitgebj: cyclades.com (long-time linux friends)
Jun 24 12:28:19 <Smerdyakov> leitgebj, no. I'm ignorant. tarsin recommended them to me first, I think.
Jun 24 12:28:30 <leitgebj> Smerdyakov, do we have a wiki page with specs on abu?
Jun 24 12:28:44 <Smerdyakov> leitgebj, I don't think so.
Jun 24 12:29:18 <Smerdyakov> Here's something from the old wiki on it:
Jun 24 12:29:19 <Smerdyakov> * AMD Athlon Thunderbird 1.33GHz 266MHz FSB 256KB cache
Jun 24 12:29:19 <Smerdyakov> * CoolJag JAC311C low profile 1U cooling fan and heatsink
Jun 24 12:29:19 <Smerdyakov> * 2x 256MB Kingston PC2100 CL2.5 DDR RAM
Jun 24 12:29:19 <Smerdyakov> * Asus A7N266-VM nVidia nForce 220 266MHz FSB motherboard (onboard vid + LAN)
Jun 24 12:29:19 <Smerdyakov> * 2x IBM 40GB 120GXP IDE 7200rpm 2MB cache 3YR warranty hard disk, mirrored with RAID-1
Jun 24 12:29:19 <Smerdyakov> * SuperMicro SC811-IDE 1U 200W rackmount case
Jun 24 12:29:19 <Smerdyakov> * 3Ware Escalade 3W-6410 PCI IDE RAID adapter (4x UDMA66 EIDE ports)
Jun 24 12:29:19 <Smerdyakov> * 32bit 33MHz PCI riser card
Jun 24 12:29:19 <Smerdyakov> * 12X IDE slim CDROM (from a Toshiba laptop)
Jun 24 12:29:48 <Smerdyakov> Later amended with these:
Jun 24 12:29:49 <Smerdyakov> * AMD Duron 900MHz 192KB cache 200MHz FSB
Jun 24 12:29:49 <Smerdyakov> * CoolJag JAC313C low profile fan and heatsink
Jun 24 12:29:49 <Smerdyakov> * Gigabyte GA-7VKML VIA KM266 motherboard (onboard vid + LAN)
Jun 24 12:30:45 <leitgebj> As long as it doesn't push us over our space limit, and it doesn't become an important single point of failure, it shouldn't matter. 1U should be fine hanging out at HE until we have a need for the space. Would we want to send it somewhere to reload the OS before shipping it to HE?
Jun 24 12:31:14 <Smerdyakov> We asked for the initial size quote assuming three servers, so I imagine it would be fine.
Jun 24 12:31:18 <docelic_> Depends on HE and setup
Jun 24 12:31:32 <docelic_> I would love to be able to do basic setup without hardware at my place
Jun 24 12:31:42 <Smerdyakov> It would probably be cheaper to have it mailed to some member to set up first.
Jun 24 12:31:45 <docelic_> If we would have serial console, I could install remotely
Jun 24 12:31:47 <leitgebj> He gives us 7U with the plan that we were looking at.
Jun 24 12:31:53 <docelic_> Smerdyakov: ^--
Jun 24 12:32:14 <Smerdyakov> docelic, but you're pretty far from the current location... and extra distance is extra chance of damage.
Jun 24 12:32:19 <leitgebj> I would probably be able to have it shipped to my place and plug it in to a high-speed connection... as long as it isn't in that period when I am moving houses.
Jun 24 12:32:37 <Smerdyakov> mwolson is closest, I think. :)
Jun 24 12:32:38 <docelic_> Smerdyakov: I said I could do it *without* it coming over to me
Jun 24 12:33:12 <Smerdyakov> There probably is very little that needs doing, with the serial console.
Jun 24 12:33:22 <Smerdyakov> Just get the OS going and hardware working..
Jun 24 12:33:25 * Optikal___ has quit (Read error: 110 (Connection timed out))
Jun 24 12:33:33 <docelic_> Smerdyakov: yes, as I said above
Jun 24 12:33:39 <leitgebj> I would have no problem doing that, or mwolson, if he wants...
Jun 24 12:33:40 <docelic_> I can even do a remote install with the console
Jun 24 12:33:51 <mwolson> i'd be willing to receive a shipment and do initial installation stuff
Jun 24 12:34:15 <Smerdyakov> mwolson, and it would also involve shipping it out again afterward, of course. :)
Jun 24 12:34:20 <mwolson> i could just bring it with me to work and use the space there :^)
Jun 24 12:34:24 <voider> where will be located the physical servers?
Jun 24 12:34:37 <mwolson> Smerdyakov: right
Jun 24 12:35:13 <Smerdyakov> voider, the quote we got seems to be for Fremont, CA... a 50 minute BART ride away from me.
Jun 24 12:35:22 <mwolson> we usually go with Debian testing, right?
Jun 24 12:35:45 <Smerdyakov> mwolson, I think that's a good idea for the users-can-log-in server; stable sounds good for the admins-only.
Jun 24 12:35:55 <mwolson> ok
Jun 24 12:36:03 <Smerdyakov> And even unstable for Abu if we use it as a generic shell server.
Jun 24 12:36:17 <voider> CA?
Jun 24 12:36:21 <Smerdyakov> California, USA
Jun 24 12:36:21 <voider> US ?
Jun 24 12:36:26 <voider> ok
Jun 24 12:37:19 <voider> Im from quebec, CA
Jun 24 12:37:23 <voider> as in Canada
Jun 24 12:37:36 <Smerdyakov> Let's see... the reason we got into this discussion was because we were considering if we need to obtain a switch.
Jun 24 12:37:46 <Smerdyakov> Let's get back to that... try to finalize the list of new hardware.
Jun 24 12:38:33 <leitgebj> Well, with three machines the dual-nic's won't be sufficient anymore. I need to look at the He.net spec sheet on the colo page of our wiki to see if they provide one.
Jun 24 12:38:37 <Smerdyakov> It sounds like, at most, we want 2 servers, serial console, and a switch.
Jun 24 12:39:38 <mwolson> it'd probably be best to have a switch so that we don't get double traffic on one of the machines (and for the sake of future expansion)
Jun 24 12:40:19 <jch5> and might we name one of them "Hopper"? (My suggested name that came in second place, i think)
Jun 24 12:40:27 <mwolson> also, if the machine that passes the connection on gets rooted or goes down, well, there goes our redundancy plan
Jun 24 12:40:27 <Smerdyakov> jch5, anything's possible. ;)
Jun 24 12:40:49 <Erlang> I guess the names should be voted later right?
Jun 24 12:40:49 <leitgebj> I'm guessing that HE won't provide a switch... we should look around at available models.
Jun 24 12:40:58 <Smerdyakov> We should probably also have a full supply of replacement parts in the cabinet, which also suggests buying servers with similar hardware.
Jun 24 12:41:01 <Smerdyakov> Erlang, yes
Jun 24 12:41:07 <Erlang> good.
Jun 24 12:41:24 * Erlang goes back to lurking mode.
Jun 24 12:41:32 <leitgebj> Smerdyakov, definitely. Support contracts also often come in handy.
Jun 24 12:42:15 <Smerdyakov> So the list seems to be: 2 servers, serial console, switch, replacement parts to cover the 2 servers (assuming failures are rare, so just one of each kind of part (?))
Jun 24 12:42:15 <leitgebj> It is nice to know that someone will ship you a part in a known amount of time when something breaks... and something always does.
Jun 24 12:42:50 <leitgebj> I don't know if we need replacement parts to cover the two servers... I don't think we want to buy a whole third server just for parts. I would rather spend the money on a support contract.
Jun 24 12:43:13 <Smerdyakov> leitgebj, that would mean that, to keep high uptime, we'd really need to be sure one server can take over for a downed neighbor.
Jun 24 12:43:36 <leitgebj> We may be able to get a support contract that guarantees that the part will arrive in 24h if something goes down.
Jun 24 12:43:41 <jch5> i know names come later..(just thought a tentative name might make discussions of which machine being referred to easier)
Jun 24 12:43:51 <Smerdyakov> With parts on hand, it can be fixed in 10 minutes...
Jun 24 12:44:39 <leitgebj> Smerdyakov, it will be tricky with only two servers. Eventually, we should have a pool of servers behind a load balancer, then redundant database back-ends, etc. But I think that this is cost-prohibitive at the moment.
Jun 24 12:45:06 <jch5> but does cost of spare parts on hand outweigh the advantage of getting better/newer parts to replace a failed?
Jun 24 12:45:32 <Smerdyakov> jch5, they would only be meant to be used temporarily and would probably have worse specifications than the usuals.
Jun 24 12:46:42 <jch5> fine...just asking the dumb questions that might prompt a smart thought by admins :)
Jun 24 12:47:44 <Smerdyakov> leitgebj, so deciding by yourself you would choose to try to get a 24-hour part replacement contract?
Jun 24 12:48:25 <leitgebj> yeah, or something similar. I think that we should move to eliminate single points of failure in the near future, but having an entire extra system in parts sitting around doesn't make sense to me.
Jun 24 12:48:46 <Smerdyakov> OK. Some colo provider I corresponded with recommended it.
Jun 24 12:49:44 <Smerdyakov> In any case, it's a feature that's easy to add later with no service interruption! :D
Jun 24 12:49:48 <jch5> if we had 10 or 15 servers, having one as spare parts makes more sense....but one for two...nah
Jun 24 12:50:32 <mwolson> jch5: yeah, having pre-bought spare parts for us seems pointless
Jun 24 12:50:40 <leitgebj> Spare parts are a good idea... but I think that I would rather take a risk without the parts now to save money, and plan for the near future where we can have load-balancers, etc., in order to have no down time in the event of failures.
Jun 24 12:50:58 <Smerdyakov> OK, then it seems our list of hardware to buy is 2xserver, serial console, and switch.
Jun 24 12:51:20 <mwolson> sounds good to me
Jun 24 12:51:39 <leitgebj> I move to at least seriously think about adding in the cost of hardware support contracts to the cost required for our configuration.
Jun 24 12:51:55 <Smerdyakov> That's OK with me.
Jun 24 12:53:10 <Smerdyakov> Time to move on to "where do we get this stuff?"?
Jun 24 12:53:17 <docelic_> sounds good to me
Jun 24 12:53:20 <leitgebj> Sure.
Jun 24 12:54:18 <mwolson> at work newegg.com seems to be popular
Jun 24 12:55:19 <Smerdyakov> PenguinComputing seems nice for complete machines with combined warranties, but they only sell machines preloaded with RedHat or SUSE. :(
Jun 24 12:55:34 <docelic_> does that matter?
Jun 24 12:55:43 <docelic_> Over a console I can run install from scratch
Jun 24 12:56:01 <Smerdyakov> docelic, seems a waste... and they must charge extra to some extent for the installation.
Jun 24 12:56:50 <leitgebj> Smerdyakov, they probably have some kickstart schema setup where they only hit F12 on bootup to load the OS. Probably just a standard part of their hardware testing anyway.
Jun 24 12:57:17 <mwolson> (newegg in particular might be good for the switch)
Jun 24 12:57:30 <Smerdyakov> mwolson, ah, but not for the servers, right? (No on-site support contracts)
Jun 24 12:57:38 <mwolson> Smerdyakov: right
Jun 24 12:58:01 <leitgebj> I have seen hundreds of dell machines work well in production, and they have solid support contracts. They are also switching to AMD in the near future. That said, I wouldn't mind if we switched to someone like Penguin, it's just that I don't have experience with them.
Jun 24 12:58:02 <Smerdyakov> I wonder what happens if the switch breaks, though. :o
Jun 24 12:58:31 <mwolson> Smerdyakov: it's a very inexpensive item ...
Jun 24 12:58:40 <Smerdyakov> mwolson, OK.
Jun 24 12:58:47 <mwolson> manufacturer probably provides a warranty as well
Jun 24 12:59:02 <leitgebj> I'd like to look at Cisco switches from a vendor with solid support. mwolson, I'm not so sure this is going to be a trivial expense ;)
Jun 24 12:59:25 <mwolson> leitgebj: why? we only have 2-3 things to plug into it
Jun 24 13:00:03 <leitgebj> If this interconnect goes down, all of our services go down.
Jun 24 13:00:19 <leitgebj> It is a single point of failure, and therefore we want it to be extremely reliable and well-supported.
Jun 24 13:02:06 <Smerdyakov> Having servers from Penguin Computing vs. Dell buys geek cred points, if nothing else. :)
Jun 24 13:02:27 <docelic_> ++
Jun 24 13:02:30 <leitgebj> Hey, if the support is there, I'm all for it.
Jun 24 13:02:54 <Smerdyakov> They sell next-day on-site service for 3 years for $119.
Jun 24 13:03:00 <mwolson> leitgebj: http://www.newegg.com/Product/Product.asp?Item=N82E16833122111
Jun 24 13:03:01 <mwolson> ^^ 5-star rated $59.99 gigabit switch
Jun 24 13:03:31 <mwolson> s/59/56/
Jun 24 13:03:56 <leitgebj> mwolson, now what about one that's rack-mountable :)
Jun 24 13:04:42 <leitgebj> I still think that money on a switch is well-spent... that seems like something that is better for an office LAN than a production server environment. Shouldn't we get a managed switch?
Jun 24 13:05:19 <Smerdyakov> I yield to any informed opinion and just reiterate that I'm willing to spend money up-front to ensure quality. :)
Jun 24 13:07:04 <mwolson> *sigh* http://www.newegg.com/Product/Product.asp?Item=N82E16833118021
Jun 24 13:07:04 <mwolson> ^^ $249.99 rack-mountable switch
Jun 24 13:07:12 <docelic_> I can't say much on switches
Jun 24 13:07:21 <jch5> newegg was recommended to me by a very techie friend...i've had good luck with them also..
Jun 24 13:07:24 <Smerdyakov> Do we want to leave these issues as "homework" and move on to something else?
Jun 24 13:07:29 <leitgebj> I still need to do research on this as well. Can we start a wiki page with some possible products and have a quick meeting, or email list discussion on this later?
Jun 24 13:07:31 <docelic_> good idea
Jun 24 13:07:43 <mwolson> can't we just throw the switch on top of a machine? does it really need to be rack-mountable?
Jun 24 13:07:48 <docelic_> it's too nitpicky for the management level we're trying to keep
Jun 24 13:08:06 <leitgebj> Right. OK with everyone if I start a wiki page on it?
Jun 24 13:08:15 <Smerdyakov> It's always OK to start a wiki page on anything.
Jun 24 13:08:27 <Smerdyakov> Worst that happens is that it's naked pictures of your cat and I have to delete it. ;)
Jun 24 13:08:38 <leitgebj> What's wrong with my cat??? :)
Jun 24 13:09:04 <docelic_> too little fur for a cat
Jun 24 13:09:17 <leitgebj> docelic, lol
Jun 24 13:09:26 <Smerdyakov> OK, I think we have our first action item for the meeting! Whoever is interested is going to research possibilities for the four items we want to buy and post information on the wiki page that leitgebj is creating.
Jun 24 13:10:07 <mwolson> ok
Jun 24 13:10:44 <jch5> and the budget for these items will be how much for each piece of equip?
Jun 24 13:10:59 <Smerdyakov> A billion dollars
Jun 24 13:11:01 <Smerdyakov> Next question?
Jun 24 13:11:29 <jch5> ok..Mr Gates...whatever
Jun 24 13:11:30 <Smerdyakov> I think we'll figure out the budget after we see the options.
Jun 24 13:12:12 * voider has quit ("Leaving")
Jun 24 13:14:01 <Smerdyakov> Other things to discuss include administrative structure.
Jun 24 13:14:18 <Smerdyakov> Is there anything else besides that that we should discuss by the end of the meeting?
Jun 24 13:15:03 <leitgebj> We should set a date to reconvene at, with concrete goals.
Jun 24 13:15:29 <Smerdyakov> OK. That seems to be a meta-point that belongs at the end.
Jun 24 13:16:07 <mwolson> let's see ... so i'm an admin now?
Jun 24 13:18:09 <Smerdyakov> mwolson, sort of
Jun 24 13:18:19 <Smerdyakov> It sounds like no one has any other topics to raise, so let's move to that.
Jun 24 13:18:48 <Smerdyakov> Since a lot of our configuration will be changing, it doesn't seem to make particular sense for either leitgebj or mwolson to start as admins now.
Jun 24 13:19:46 <Smerdyakov> The first task for the new admin team is this hardware and software planning and set-up, as I see it.
Jun 24 13:20:16 <Smerdyakov> After that, we move into maintenance mode, handling issues of predictable time availability, handling support requests, etc..
Jun 24 13:20:53 <Smerdyakov> One thing I want to do is create a new hcoop-sysadmin mailing list that is open to all members.
Jun 24 13:21:17 <Smerdyakov> The idea would be that any sysadmin stuff is discussed there. There's no reason not to get advice from knowledgeable members just because it doesn't make sense to give them all root access.
Jun 24 13:22:27 <mwolson> can we make portal requests trigger an automated email to that new list, for those that prefer email to web forms?
Jun 24 13:22:38 <mwolson> (thinking of metaperl in particular)
Jun 24 13:22:58 <docelic_> Smerdyakov: isn't that admins@ alias now ?
Jun 24 13:23:01 <Smerdyakov> mwolson, portal requests already e-mail everyone subscribed to the associated category.
Jun 24 13:23:05 <docelic_> we could just turn it into a ML
Jun 24 13:23:14 <mwolson> ah, didn't know that
Jun 24 13:23:25 <Smerdyakov> mwolson, you can already subscribe to any that interest you.
Jun 24 13:24:07 <docelic_> I also think Smerdyakov and me can handle issues of porting current infrastructure to new servers, while new admins (mwolson, leitgebj) can be given tasks that don't rely on our current setup
Jun 24 13:24:12 <Smerdyakov> docelic, yes, but the key difference is that anyone can subscribe.
Jun 24 13:24:29 <mwolson> if we make hcoop-sysadmin a catch-all mailing list that gets all portal requests, though, it might be a good thing
Jun 24 13:24:33 <docelic_> Smerdyakov: sure, Im just saying, move admins@ to be a ML and notify admins that it's now a public lisr
Jun 24 13:24:36 <docelic_> list*
Jun 24 13:24:54 <docelic_> Smerdyakov: or not, we should have an alias for admins that is private
Jun 24 13:24:56 <Smerdyakov> mwolson, I don't know about that. Filtering based on subject is quite useful.
Jun 24 13:25:20 <mwolson> (unless we want to separate support from administration, which is somewhat nebulous)
Jun 24 13:25:23 <Smerdyakov> mwolson, I'm thinking of the mailing list as being for discussion, like that surrounding our current planning.
Jun 24 13:25:51 <Smerdyakov> mwolson, I never intended to suggest that the sysadmin list would be the target for e-mailed support requests.
Jun 24 13:25:56 <mwolson> ah, i see
Jun 24 13:26:16 <Smerdyakov> The portal has supported collaboration from all members in handling support issues from the start, even if you didn't read the part of the wiki page that explains that. ;)
Jun 24 13:28:13 <Smerdyakov> So I think I'll create that list after the meeting ends, along with modifying the portal to let people manage their subscriptions to it.
Jun 24 13:28:44 * docelic has quit (Remote closed the connection)
Jun 24 13:28:50 * docelic_ is now known as docelic
Jun 24 13:29:06 <Smerdyakov> And we can switch all potential hcoop-discuss chatter on technical planning issues to hcoop-sysadmin.
Jun 24 13:29:21 <leitgebj> That seems to make sense.
Jun 24 13:30:28 <Smerdyakov> On AdminPolicies, I talk about guaranteed within-24-hours e-mail response time from anyone in an official position.
Jun 24 13:30:44 <Smerdyakov> I think e-mail to hcoop-sysadmin sounds like a good way to notify of times when you're away and unable to provide that.
Jun 24 13:31:05 <docelic> yep
Jun 24 13:31:25 <leitgebj> Sounds good.
Jun 24 13:31:54 <Smerdyakov> There are a number of other suggestions for procedures that I've made on AdminPolicies, but it seems like we can/should wait for those until we are switched over to the new set-up.
Jun 24 13:33:09 <Smerdyakov> Mmmm.... anything else before we jump to the conclusion?
Jun 24 13:33:41 <docelic> not here
Jun 24 13:34:28 <leitgebj> not here either...
Jun 24 13:34:51 <Smerdyakov> OK, then when should we meet next? If we're just doing research on hardware in the interim, one week seems like long enough to me.
Jun 24 13:35:08 <docelic> it's ok
Jun 24 13:35:10 <leitgebj> Agreed.
Jun 24 13:35:29 <Smerdyakov> How about same time next week?
Jun 24 13:35:34 <docelic> yep
Jun 24 13:35:35 <jch5> user agrees also..
Jun 24 13:35:40 <docelic> (I thought that was implied)
Jun 24 13:35:41 <leitgebj> mwolson, is one week OK for you? you had some opinions on hardware...
Jun 24 13:36:09 <mwolson> one week is fine
Jun 24 13:36:25 <Smerdyakov> mwolson, and you can show up on time next week? ;)
Jun 24 13:36:40 <mwolson> will try; got 4pm instead of 14:00 in my head for some reason
Jun 24 13:37:09 <Smerdyakov> Did you follow the link to the time-all-over-the-world page?
Jun 24 13:37:16 <mwolson> yes ...
Jun 24 13:37:19 <Smerdyakov> OK.
Jun 24 13:37:45 <Smerdyakov> I will send out an e-mail announcing the main results of the meeting. Let's see if I've got everything that I should include:
Jun 24 13:37:51 <Smerdyakov> Link to log of this meeting on the wiki
Jun 24 13:38:06 <Smerdyakov> Decided on new admins for after the hosting switch
Jun 24 13:38:10 <Smerdyakov> Decided on HE
Jun 24 13:38:21 <Smerdyakov> Decided on what hardware we want
Jun 24 13:38:34 <Smerdyakov> Researching hardware options/prices now, to be recorded on a page that leitgebj should tell me about :)
Jun 24 13:38:48 <Smerdyakov> Next meeting in a week, hopefully with ntk in attendance so that it's a real board meeting
Jun 24 13:39:00 <leitgebj> Sounds good, I will do that in a minute.
Jun 24 13:39:09 <Smerdyakov> Oh, and add new hcoop-sysadmin list in there.
Jun 24 13:39:39 <leitgebj> I will update the wiki page to reflect our conclusions here, except for the IRC meeting log that Smerdyakov will upload, OK?
Jun 24 13:39:42 <Smerdyakov> I think that's the whole list.
Jun 24 13:39:44 <Smerdyakov> Sure.
Jun 24 13:39:48 <Smerdyakov> Anything I missed, anyone?
Jun 24 13:40:03 <jch5> thanks for answering my questions!
Jun 24 13:40:06 <leitgebj> Looks good to me.
Jun 24 13:40:58 <Smerdyakov> OK, then I guess that concludes the meeting!