[07:26:22] DBA, Analytics, Operations, ops-eqiad, and 2 others: rack/setup/install labsdb1012.eqiad.wmnet - https://phabricator.wikimedia.org/T215231 (Marostegui) @elukey the problem is that if we add it to the existing proxies, they'll be reachable by wikireplica users, as there is a round robin there, so...
[08:09:25] jynus: o/ - if you have time, what would be the best name for the new role for labsdb1012?
[08:09:40] currently it is labs::db::wikireplica_analytics
[08:10:56] I got your comment now and I like it, my main concern though is also duplicating the role's hiera config, and long term keeping it in sync with labs::db::wikireplica_analytics (since in theory we agreed that labsdb1012 could be swapped in as a replacement if a labsdb breaks badly)
[08:11:16] (since the analytics team will use it only once a month initially and we feel bad :)
[08:13:45] labs::db::wikireplica_analytics isn't an analytics role
[08:14:58] what is the role's hiera config at the moment?
[08:16:26] the pt-kill config?
[08:16:35] because you do want a different configuration for that
[08:18:37] wikireplica_analytics_dedicated? I don't know, I do not care much about the name
[08:25:22] jynus: sure I know it isn't an analytics role
[08:25:23] node /labsdb10(10|11|12)\.eqiad\.wmnet/ {
[08:25:23] role(labs::db::wikireplica_analytics)
[08:25:23] }
[08:25:32] but we chose to use the same role for labsdb1012
[08:25:58] what I am saying is that you actually need different pt-kill config
[08:26:17] sure but I have no idea about the pt-kill configuration
[08:26:43] that is ok, that is why we are here
[08:26:48] if you guys are ok with the new role/config/etc.. I am fine as well :)
[08:26:52] just saying, more reasons for it to be a separate role
[08:27:12] while not touching the profile
[08:27:46] elukey: hieradata/role/common/labs/db/wikireplica_analytics.yaml
[08:27:49] there are things about that structure that I agree with you are a bit overblown
[08:27:58] that is the pt-kill config, we'd need an if or something for labsdb1012
[08:28:05] but complaints go to the puppet style guide, not us :-D
[08:28:43] no if
[08:28:48] just different hiera keys
[08:29:10] hieradata/role/common/labs/db/wikireplica_analytics_dedicated.yaml
[08:29:13] or something
[08:29:38] I'd prefer wikireplica_analytics::dedicated if you don't mind
[08:29:46] I can amend the puppet patch now
[08:29:53] sure
[08:29:54] whatever, as I said, I don't care about the name
[08:32:07] that is cloud name space, they "rule" over it, so talk to them
[08:33:22] sure
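A minimal sketch of the role split discussed above, assuming the name proposed in the chat (labs::db::wikireplica_analytics::dedicated) is the one adopted and that labsdb1012 is simply taken out of the shared node regex. The patch under review may be structured differently.

```puppet
# Existing wikireplica analytics hosts keep the shared role.
node /labsdb10(10|11)\.eqiad\.wmnet/ {
    role(labs::db::wikireplica_analytics)
}

# labsdb1012 gets its own role (name as proposed in the chat, an assumption
# here), so its pt-kill settings can live under separate hiera keys.
node 'labsdb1012.eqiad.wmnet' {
    role(labs::db::wikireplica_analytics::dedicated)
}
```

With a layout like this, the dedicated role would read its hiera data from a role file of its own rather than hieradata/role/common/labs/db/wikireplica_analytics.yaml, which is what avoids an if in the shared config.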
[08:33:32] so I am adding the pt-kill config in hiera
[08:33:42] suggestions about diffs from the other labsdbs?
[08:34:07] so the analytics one kills queries longer than 4 hours
[08:34:17] it is up to you to decide if that is too aggressive or not :)
[08:34:24] ack lemme ask my team
[08:34:48] if it is only going to be used by your team, I guess it doesn't make much sense to have it
[08:34:56] maybe something like 12h or 24h (to at least have something)
[08:35:26] 4h seems fine from what people are telling me
[08:35:37] sure, your team's call :)
[08:35:53] bear in mind that if we use this as a replacement in case of emergencies we might want to have some protection in place
[08:36:00] so I'll change pt-kill only if really needed
[08:36:21] sounds good to me :)
[08:56:27] all right ready for review - https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/494874/
[09:31:44] I am going to merge this https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/494880/
[09:40:13] cool, I delayed the review because I wanted to do a scan for more stuff we may be missing
[09:40:37] but everything there is ok, I am just not sure if everything that should be there is there :-D
[09:41:10] Yeah, I couldn't find anything else
[09:41:40] so you have my +1, not worth delaying, I can do another check at another time
[09:41:48] thank you!
[09:42:01] let's set up a meeting to discuss backups, not urgent
[09:42:48] sure, send me an invite for next week anytime you like
[09:52:32] all backups worked after the latest deploy
[09:52:42] \o/
[09:52:44] and stats too?
[09:53:21] yep
[09:53:25] good job!
[10:09:01] Blocked-on-schema-change, DBA, Patch-For-Review, Schema-change: Dropping page.page_no_title_convert on wmf databases - https://phabricator.wikimedia.org/T86342 (Marostegui)
[10:58:44] Avengers: MariaDB Backups: Endgame https://gerrit.wikimedia.org/r/494899
[11:08:47] haha
[11:08:53] Will take a look when ready for review :)
[12:11:27] one note on https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/492321/6: os_version() doesn't currently match on buster at this point, due to some internal LSB settings it still identifies as "testing". but that should be changed soon and db1114 is only a test host, so probably fine
[12:11:49] the alternative is to match $::lsbdistcodename == 'buster', that already works
[12:18:05] I suppose so, and that is ok
[12:18:29] as long as it will eventually work
[12:38:29] ack
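The alternative mentioned above can be sketched as a plain fact comparison. This is only an illustration: the notice() body is a hypothetical placeholder, not what the patch under review does.

```puppet
# Match buster via the LSB codename fact, which per the chat already
# reports 'buster' even though os_version() does not match yet.
if $::lsbdistcodename == 'buster' {
    # Hypothetical placeholder for the buster-only resources.
    notice('applying buster-specific configuration')
}
```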
[13:56:51] all labs, sanitariums and sanitarium masters are now running .38
[14:32:19] great
[17:53:04] DBA, Data-Services, Toolforge, Tracking: Certain tools users create multiple long running queries that take all memory and/or CPU from labsdb hosts, slowing it down and potentially crashing (tracking) - https://phabricator.wikimedia.org/T119601 (jcrespo)