This page is to help us collect things we want to work on and get done in 2013. Initially it will help us organize what we want to get done at the upcoming FUDCon Lawrence (hackfests, presentations, etc.). After that it may be repurposed to note the things we are actually going to work on in the coming year.

= fudcon =

Let's coordinate and gather here the things we want to do at FUDCon. (Don't forget to add these to the main FUDCon page as soon as we have decided on them.) I am planning to have a high-level "These are things we want to work on" session Saturday morning. Hopefully everyone can attend, and then we can go off and do those things.

== technical sessions (friday) ==

* Several infrastructure folks will be giving tech talks. Please attend and heckle^Whelp.

== hackfests (saturday and sunday) ==

== In the Fedora 19 cycle ==

* Move publictest to the cloud and create a sundown on them

* Move dev instances all to the cloud.

* Make a push-based fasClient with ansible; replace the fasClient cron job on the infra boxes with it.

== In the Fedora 20 cycle ==

= Idea box for 2012 and beyond =

* Integrate jenkins into our infrastructure and framework (pingou)

* Make a clearer division between back-end and front-end in our (web) apps (pingou)

** Helps with testing (unit tests)

** Reduces the application's dependency on a particular framework

* Automate the generation of the statistics report: https://fedoraproject.org/wiki/Statistics (pingou)

** Ok, I think I have this covered: https://github.com/pypingou/fedora-stats (cf. http://ambre.pingoured.fr/fedora-stats2/ )

* Reduce the number of frameworks used? (pingou)

** This does mean porting old apps to new(er) frameworks

* Question: what are we going to do when/if EL7 is released in 2013? (From an app point of view)

* Set up an Intrusion Detection System (lmacken)

** Have had great experiences with using [http://www.openinfosecfoundation.org suricata] personally...

* Restructure our app/proxy layout (skvidal)

** Our current app model makes it difficult to determine which app is causing a problem, so our solutions tend to be pretty coarse-grained. Given the failure-prone state of our apps, it seems we should adopt a model that makes it simpler to see where the problems are coming from. As our apps stabilize, we can move to an environment sharing more resources.

* ARM servers in infrastructure

** Discuss issues around using some ARM instances for our needs.

** Would likely need to use Fedora instead of RHEL

** What things would be good for them?

* Revamp nagios

** Use check_mk on all machines and add a small number of custom checks on top.

* Rework fasClient (laxathom)

** Make it a friend of fedmsg so we can trigger actions (push mode) only when needed, based on the server's profile.

* Interactive shell for fas administration (laxathom)

** So we can avoid, for example, hacking directly into the FAS DB.

* Koji-stg (laxathom)

** Just finish what should be done here.

* fedorahosted - auto-setup-scm-project (laxathom)

** Use the fasClient rework above so we can trigger creation (push mode) of the SCM once the related group has been created in FAS.

* FAS v3 (laxathom)

** Started a proposal at https://fedorahosted.org/fas/wiki/docs/draft/FAS3#no1

** This proposal will need some rework given all the nice features we added in 2012.

= old stuff from 2011 / 2012 =

* Update pkgdb to have an admin console (no more direct db needs)

* Fix the Django auth providers to be faster

* Automated hosted projects (*)

* Automated creation of new machines -- run one command and it's up

** Reduce koji's resources

* Finish and deploy coprs

* go through list of rpm -Va on all hosts (in /var/tmp/global-rpm-va on puppet01) and make sure all the files there have counterparts in puppet to explain their changes (*)

* the puppet nodenames do not match the hostnames in nagios. Add aliases to the nagios hostnames to match them up correctly. This will allow us to trigger passive checks using nsca. (ties to nagios revamp)

* Look at whether the git email hook can be done async. If so, make it async and change it to query the packagedb for people to email instead of using the PACKAGE-owner email aliases. (This will eliminate bounces when the alias does not exist, for instance, new package requests and when the only owner of a package is orphan@fp.o)

* Set up a schedule for rebooting hosts (to test for broken hw when it's not a critical point in the release cycle)

2013 Fedora Infrastructure tasks

overview

This page is to help us collect things we want to work on and get done in 2013. Initially it will help us organize what we want to get done at the upcoming FUDCon Lawrence (hackfests, presentations, etc.). After that it may be repurposed to note the things we are actually going to work on in the coming year.

fudcon

Let's coordinate and gather here the things we want to do at FUDCon. (Don't forget to add these to the main FUDCon page as soon as we have decided on them.) I am planning to have a high-level "These are things we want to work on" session Saturday morning. Hopefully everyone can attend, and then we can go off and do those things.

technical sessions (friday)

Several infrastructure folks will be giving tech talks. Please attend and heckle^Whelp.

hackfests (saturday and sunday)

cloudy with a chance of infrastructure - finish up stuff around private clouds, move to production.

revamp our apprentice/new contributor process - figure out a way to get more people involved long term. (more mentoring?)

ansible - figure out any setup and questions, timetable to replace puppet

lightning talks (friday)

2013

This will be a list of things we want to get done in those timeframes.

2013 infrastructure FAD

The FAD worked great for getting 2 factor auth done. If we can get funding, we should consider holding another one on a different topic. Ideas welcome here.

monitoring - fix nagios, revamp how we manage it, make it stop bothering us all, but still tell us about issues, etc.

In the Fedora 19 cycle

Move publictest to the cloud and create a sundown on them

Move dev instances all to the cloud.

Make a push-based fasClient with ansible; replace the fasClient cron job on the infra boxes with it.

In the Fedora 20 cycle

Idea box for 2012 and beyond

Integrate jenkins into our infrastructure and framework (pingou)

Make a clearer division between back-end and front-end in our (web)-app (pingou)

restructure our app/proxy layout (skvidal)

Our current app model makes it difficult to determine which app is causing a problem, so our solutions tend to be pretty coarse-grained. Given the failure-prone state of our apps, it seems we should adopt a model that makes it simpler to see where the problems are coming from. As our apps stabilize, we can move to an environment sharing more resources.

ARM servers in infrastructure

Discuss issues around using some ARM instances for our needs.

Would likely need to use Fedora instead of RHEL

What things would be good for them?

Revamp nagios

Use check_mk on all machines and add a small amount of custom checks on top.

Automate adding nodes, etc

Extend 2 factor auth or other security measures past sysadmin groups?

hosted? pkgs? specific groups?

signed commits?

Fedorahosted-ng

Ditch trac for something better?

gitlabhq or other easier interface for git repos?

Decentralize!

Search engine? try and get dpsearch working again?

Rework fasClient (laxathom)

Make it daemonizable

Make it a friend of fedmsg so we can trigger actions (push mode) only when needed, based on the server's profile.

Interactive shell for fas administration (laxathom)

So we can avoid, for example, hacking directly into the FAS DB.

Koji-stg (laxathom)

just finish what should be done here.

fedorahosted - auto-setup-scm-project (laxathom)

Use the rework above from fasClient so we can trigger creation (push-mode) of scm once related group has been created from fas.

FAS v3 (laxathom)

Started a proposal at https://fedorahosted.org/fas/wiki/docs/draft/FAS3#no1

This proposal will need some rework given all the nice features we added in 2012.

old stuff from 2011 / 2012

Here's stuff we talked about in the past and never got done:

Upgrade TurboGears1 apps to TurboGears2

Write automated tests using TG2's test framework

Fix the FAS authenticators to be less chatty

Put fas session information into memcached

Update FAS to have an admin console (no more direct db needs)

Update pkgdb to have an admin console (no more direct db needs)

Fix the Django auth providers to be faster

Automated hosted projects (*)

Automated creation of new machines -- run one command and it's up

glusterfs/cloudfs fedorapeople filesystem

Replicate db so that we don't have a SPOF

logging sucks (*)

IPs hit proxies but we also need them to hit the app servers. (*)

FAS needs to log more actions to its database (this is in a new version of FAS, we just need to upgrade)

Do periodic reinstallations of guests (like app servers) so that we know nothing has changed that is not in puppet.

Reduce koji's resources

Finish and deploy coprs

the puppet nodenames do not match the hostnames in nagios. Add aliases to the nagios hostnames to match them up correctly. This will allow us to trigger passive checks using nsca. (ties to nagios revamp)

Set up a schedule for rebooting hosts (to test for broken hw when it's not a critical point in the release cycle)