ABran-WMF (arnaudb)
User

User Details

User Since
Aug 29 2023, 8:30 AM (50 w, 4 d)
Availability
Available
LDAP User
Arnaudb
MediaWiki User
ABran-WMF [ Global Accounts ]

Recent Activity

Yesterday

ABran-WMF moved T363665: Create a cookbook to restart mariadb on all sanitarium hosts from In progress to Pending comment on the DBA board.
Fri, Aug 16, 1:02 PM · Patch-For-Review, DBA
ABran-WMF added a comment to T363665: Create a cookbook to restart mariadb on all sanitarium hosts.

it has its dedicated PS now: https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/1063167

Fri, Aug 16, 9:35 AM · Patch-For-Review, DBA
ABran-WMF closed T372208: Degraded RAID on es1029 as Resolved.

host is fully repooled with no issue.

Fri, Aug 16, 6:28 AM · DBA, DC-Ops, SRE, ops-eqiad

Wed, Aug 14

ABran-WMF moved T363665: Create a cookbook to restart mariadb on all sanitarium hosts from Blocked to In progress on the DBA board.

Here is the cookbook update with the proper methods, from the released spicerack version.

Wed, Aug 14, 3:41 PM · Patch-For-Review, DBA
ABran-WMF changed the status of T372208: Degraded RAID on es1029 from In Progress to Open.

thanks @VRiley-WMF for the disk swap, I'll keep the task open to check how the rebuild went later on

Wed, Aug 14, 2:30 PM · DBA, DC-Ops, SRE, ops-eqiad
ABran-WMF moved T367282: Migrate mysql icinga alerts to alert manager - read only status from Ready to Blocked on the DBA board.
Wed, Aug 14, 2:09 PM · Patch-For-Review, DBA
ABran-WMF moved T367279: Migrate mysql icinga alerts to alert manager - seconds_behind_master + threads (replication/io) from In progress to Blocked on the DBA board.
Wed, Aug 14, 2:09 PM · Patch-For-Review, DBA
ABran-WMF added a comment to T372287: Create new translate_message_group_subscriptions table on Wikimedia wikis with the Translate extension installed.

sanitarium instances have been restarted with the patch

Wed, Aug 14, 1:10 PM · LPL Essential (LPL Essential 2024 Jul-Sep), Localization Infrastructure FY2023-24, MediaWiki-extensions-Translate, DBA
ABran-WMF added a comment to T372208: Degraded RAID on es1029.

Sure, let me know when you're up today

Wed, Aug 14, 7:09 AM · DBA, DC-Ops, SRE, ops-eqiad
ABran-WMF moved T371342: db1238 bus critical errors from Triage to In progress on the DBA board.
Wed, Aug 14, 6:51 AM · DBA, SRE, ops-eqiad, DC-Ops, Data-Persistence

Tue, Aug 13

ABran-WMF added a comment to T372287: Create new translate_message_group_subscriptions table on Wikimedia wikis with the Translate extension installed.

@Marostegui I've sent this patch:

Tue, Aug 13, 3:39 PM · LPL Essential (LPL Essential 2024 Jul-Sep), Localization Infrastructure FY2023-24, MediaWiki-extensions-Translate, DBA
ABran-WMF added a comment to T372208: Degraded RAID on es1029.

@VRiley-WMF please let me know when you're ready, I'll depool the node then

Tue, Aug 13, 7:13 AM · DBA, DC-Ops, SRE, ops-eqiad
ABran-WMF moved T371049: prometheus-mysqld-exporter doesn't fully support multi-instances for pt-heartbeat from Triage to Refine on the DBA board.
Tue, Aug 13, 6:51 AM · DBA

Mon, Aug 12

ABran-WMF moved T372287: Create new translate_message_group_subscriptions table on Wikimedia wikis with the Translate extension installed from Triage to In progress on the DBA board.
Mon, Aug 12, 3:55 PM · LPL Essential (LPL Essential 2024 Jul-Sep), Localization Infrastructure FY2023-24, MediaWiki-extensions-Translate, DBA
ABran-WMF moved T371482: Better error handling for db-switchover from Triage to Ready on the DBA board.
Mon, Aug 12, 3:53 PM · DBA
ABran-WMF moved T371483: Dry run for db-switchover when moving slaves from Triage to Ready on the DBA board.
Mon, Aug 12, 3:53 PM · DBA
ABran-WMF moved T371927: Degraded RAID on db1174 from Triage to In progress on the DBA board.
Mon, Aug 12, 3:53 PM · DBA, DC-Ops, SRE, ops-eqiad
ABran-WMF moved T370852: Migrate codfw row C & D database hosts to new Leaf switches from Triage to In progress on the DBA board.

@ABran-WMF please coordinate with @cmooney for this.

Mon, Aug 12, 3:53 PM · DBA, ops-codfw, Infrastructure-Foundations, netops, DC-Ops, SRE
ABran-WMF added a comment to T372208: Degraded RAID on es1029.

forgot to add:

Mon, Aug 12, 3:34 PM · DBA, DC-Ops, SRE, ops-eqiad
ABran-WMF changed the status of T371759: Prepare and check storage layer for bdrwiki, a subtask of T371757: Create Wikipedia West Coast Bajau, from Open to In Progress.
Mon, Aug 12, 3:07 PM · MW-1.43-notes (1.43.0-wmf.17; 2024-08-06), Wiki-Setup (Create)
ABran-WMF changed the status of T371759: Prepare and check storage layer for bdrwiki from Open to In Progress.

All done, ready for the views creation.

Mon, Aug 12, 3:07 PM · Data-Services, DBA
ABran-WMF changed the status of T372208: Degraded RAID on es1029 from Open to In Progress.
Mon, Aug 12, 12:44 PM · DBA, DC-Ops, SRE, ops-eqiad
ABran-WMF placed T371984: Q1:rack/setup/install backup2012 up for grabs.

Thank you for the explanation 🙏

Mon, Aug 12, 9:50 AM · SRE, Data-Persistence, Data-Persistence-Backup, ops-codfw, DC-Ops

Fri, Jul 19

ABran-WMF updated the task description for T367781: Drop deprecated abuse filter fields on wmf wikis.
Fri, Jul 19, 12:52 PM · Data-Engineering, Schema-change-in-production, DBA
ABran-WMF updated subscribers of T365998: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 -lsw1-f3-eqiad .

@Marostegui and I will be absent on Tuesday; hosts have been depooled and are ready.

Fri, Jul 19, 12:25 PM · SRE-swift-storage, DBA, Data-Persistence, Infrastructure-Foundations, netops, SRE
ABran-WMF claimed T370394: Drop gb_by from globalblocks table.
Fri, Jul 19, 7:04 AM · Data-Engineering, Schema-change-in-production, DBA

Jul 18 2024

ABran-WMF added a comment to T365998: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 -lsw1-f3-eqiad .

data-persistence hosts handled, ready whenever you are @cmooney

Jul 18 2024, 2:49 PM · SRE-swift-storage, DBA, Data-Persistence, Infrastructure-Foundations, netops, SRE
ABran-WMF closed T367496: MySQL_legacy Spicerack - fixes as Resolved.

Since https://github.com/wikimedia/operations-software-spicerack/blob/v8.8.0/CHANGELOG.rst has been tested, this can be considered done. The only remaining item is improving how exceptions are carried when a SQL command fails on a host, tracked in T370419

Jul 18 2024, 2:38 PM · DBA
ABran-WMF triaged T370419: Improve Exceptions on command failure as Low priority.
Jul 18 2024, 2:37 PM · DBA
ABran-WMF created T370419: Improve Exceptions on command failure.
Jul 18 2024, 2:36 PM · DBA
ABran-WMF changed the status of T370265: Create new translate_cache table on Wikimedia wikis with the Translate extension installed from Open to In Progress.
Jul 18 2024, 6:15 AM · MW-1.43-notes (1.43.0-wmf.19; 2024-08-20), Data-Persistence, LPL Essential (LPL Essential 2024 Jul-Sep), MediaWiki-extensions-Translate

Jul 17 2024

ABran-WMF changed the status of T367279: Migrate mysql icinga alerts to alert manager - seconds_behind_master + threads (replication/io), a subtask of T315866: Migrate mysql icinga alerts to alert manager, from Open to In Progress.
Jul 17 2024, 3:34 PM · Patch-For-Review, DBA
ABran-WMF changed the status of T367279: Migrate mysql icinga alerts to alert manager - seconds_behind_master + threads (replication/io) from Open to In Progress.
Jul 17 2024, 3:34 PM · Patch-For-Review, DBA
ABran-WMF updated the task description for T367279: Migrate mysql icinga alerts to alert manager - seconds_behind_master + threads (replication/io).
Jul 17 2024, 3:34 PM · Patch-For-Review, DBA
ABran-WMF added a comment to T369855: db1179 crashed - hardware issues.

sure thing!

Jul 17 2024, 2:54 PM · SRE, DC-Ops, ops-eqiad, DBA
ABran-WMF closed T367280: Migrate mysql icinga alerts to alert manager - memory pressure as Declined.

Could be considered redundant with T367283, which would offer a more specific angle.

Jul 17 2024, 2:13 PM · Patch-For-Review, DBA
ABran-WMF closed T367280: Migrate mysql icinga alerts to alert manager - memory pressure, a subtask of T315866: Migrate mysql icinga alerts to alert manager, as Declined.
Jul 17 2024, 2:13 PM · Patch-For-Review, DBA
ABran-WMF added a comment to T370029: cumin2002 db-switchover debug.

done!

Jul 17 2024, 1:07 PM · DBA
ABran-WMF added a comment to T370029: cumin2002 db-switchover debug.

I've updated

  • dbstore1009
  • db1164 (m1)
  • db1176 (m5)
Jul 17 2024, 1:04 PM · DBA
ABran-WMF added a comment to T370029: cumin2002 db-switchover debug.

The issue is that cumin2002 has python3-wmfmariadbpy at version 0.10, while cumin1002 and most of the hosts with that package have version 0.11.2.
[...] how do you manage the versioning and upgrade of this package?
Full list of hosts with the old version available in Debmonitor: https://debmonitor.wikimedia.org/packages/python3-wmfmariadbpy

Jul 17 2024, 1:01 PM · DBA
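
The version mismatch described in the comment above (0.10 on cumin2002 versus 0.11.2 on cumin1002) can be illustrated with a minimal sketch that flags hosts lagging behind the newest installed version. The helper names `parse_version` and `find_outdated` are hypothetical, the host-to-version mapping is typed in by hand rather than fetched from Debmonitor, and the comparison assumes plain dotted numeric versions (no Debian epochs or revision suffixes).

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '0.11.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def find_outdated(installed: dict[str, str]) -> dict[str, str]:
    """Return the hosts whose version is older than the newest one seen."""
    newest = max(installed.values(), key=parse_version)
    return {
        host: version
        for host, version in installed.items()
        if parse_version(version) < parse_version(newest)
    }


# Versions quoted in the comment; a real inventory would come from elsewhere.
installed = {"cumin1002": "0.11.2", "cumin2002": "0.10"}
print(find_outdated(installed))
```

Comparing tuples of integers rather than raw strings matters here: as strings, "0.10" and "0.11.2" happen to sort correctly, but "0.9" would wrongly sort after "0.10".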
ABran-WMF updated the task description for T367781: Drop deprecated abuse filter fields on wmf wikis.
Jul 17 2024, 10:00 AM · Data-Engineering, Schema-change-in-production, DBA
ABran-WMF updated subscribers of T369855: db1179 crashed - hardware issues.

This server has been down for a few days, @wiki_willy please let me know if I can help

Jul 17 2024, 6:41 AM · SRE, DC-Ops, ops-eqiad, DBA
ABran-WMF added a comment to T367781: Drop deprecated abuse filter fields on wmf wikis.

thanks! will roll the change there as well

Jul 17 2024, 6:32 AM · Data-Engineering, Schema-change-in-production, DBA

Jul 16 2024

ABran-WMF added a comment to T365997: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 -lsw1-f2-eqiad .

dbstore1009 has replication up to date on all 3 instances

Jul 16 2024, 3:30 PM · SRE-swift-storage, DBA, Data-Persistence, Infrastructure-Foundations, netops, SRE
ABran-WMF added a comment to T370029: cumin2002 db-switchover debug.

We could probably try and rely on read_default_file like we do in other tools

Jul 16 2024, 1:50 PM · DBA
ABran-WMF updated the task description for T367781: Drop deprecated abuse filter fields on wmf wikis.
Jul 16 2024, 8:12 AM · Data-Engineering, Schema-change-in-production, DBA
ABran-WMF updated the task description for T367781: Drop deprecated abuse filter fields on wmf wikis.
Jul 16 2024, 7:24 AM · Data-Engineering, Schema-change-in-production, DBA
ABran-WMF updated the task description for T367781: Drop deprecated abuse filter fields on wmf wikis.
Jul 16 2024, 7:15 AM · Data-Engineering, Schema-change-in-production, DBA

Jul 15 2024

ABran-WMF created P66481 testing puppet.
Jul 15 2024, 1:13 PM
ABran-WMF triaged T368874: Productionize dbproxy102[89] as Medium priority.
Jul 15 2024, 12:48 PM · DBA
ABran-WMF triaged T370029: cumin2002 db-switchover debug as Medium priority.

debug is already in progress, this task is to track what's been done

Jul 15 2024, 10:13 AM · DBA
ABran-WMF created T370029: cumin2002 db-switchover debug.
Jul 15 2024, 10:13 AM · DBA
ABran-WMF added a comment to T362529: Create a Wikimedians of United Arab Emirates User Group Wiki.

@ABran-WMF, @fnegri - I believe that we are ready to run the sre.wikireplicas.add-wiki cookbook for this wiki, which should make it available on both clouddb* and an-redacteddb1001 hosts.
Are you happy for us (myself and @Stevemunene ) to run that now, or is there anything else that either of you feels need be done first?

Jul 15 2024, 10:12 AM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Services, Wiki-Setup (Create)
ABran-WMF moved T362824: Q#:rack/setup/install dbproxy200[5-8] from Triage to In progress on the DBA board.
Jul 15 2024, 7:17 AM · DBA, SRE, ops-codfw, Data-Persistence, DC-Ops

Jul 12 2024

ABran-WMF updated the task description for T367781: Drop deprecated abuse filter fields on wmf wikis.
Jul 12 2024, 1:27 PM · Data-Engineering, Schema-change-in-production, DBA
ABran-WMF added a comment to T367781: Drop deprecated abuse filter fields on wmf wikis.

execution collided with T367856 on s7, so it was stopped; repooling will resume Monday.

Jul 12 2024, 9:11 AM · Data-Engineering, Schema-change-in-production, DBA
ABran-WMF added a project to T369855: db1179 crashed - hardware issues: ops-eqiad.

I am unable to reach it via the management interface either; it might need a bit of hands-on work

Jul 12 2024, 9:01 AM · SRE, DC-Ops, ops-eqiad, DBA
ABran-WMF changed the status of T369855: db1179 crashed - hardware issues from Open to In Progress.
Jul 12 2024, 8:51 AM · SRE, DC-Ops, ops-eqiad, DBA

Jul 11 2024

ABran-WMF awarded T362893: Spicerack support for dbctl a Love token.
Jul 11 2024, 3:56 PM · Patch-For-Review, DBA, Infrastructure-Foundations, conftool, SRE-tools, Spicerack
ABran-WMF added a comment to T365996: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-f1-eqiad .

dbhost repooling
dbproxy reloaded
backuphost checked and looks green

Jul 11 2024, 2:47 PM · SRE-swift-storage, DBA, Data-Persistence, Infrastructure-Foundations, netops, SRE
ABran-WMF awarded T369720: Clean pt-heartbeat from read only external store nodes a Party Time token.
Jul 11 2024, 10:27 AM · DBA
ABran-WMF added a project to T362529: Create a Wikimedians of United Arab Emirates User Group Wiki: Data-Services.

@Zabe it seems we were missing the "storage layer" task we usually get. Anyway, this is done on our side.

Jul 11 2024, 7:47 AM · MW-1.43-notes (1.43.0-wmf.15; 2024-07-23), Data-Services, Wiki-Setup (Create)

Jul 10 2024

ABran-WMF added a comment to T365993: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 - lsw1-e1-eqiad.

db1190 repooling
dbproxy reloaded

Jul 10 2024, 3:46 PM · SRE-swift-storage, DBA, Data-Persistence, Infrastructure-Foundations, netops, SRE
ABran-WMF triaged T369720: Clean pt-heartbeat from read only external store nodes as Medium priority.
Jul 10 2024, 1:57 PM · DBA
ABran-WMF created T369720: Clean pt-heartbeat from read only external store nodes.
Jul 10 2024, 1:56 PM · DBA
ABran-WMF closed T367278: Migrate mysql icinga alerts to alert manager - pt-heartbeat + scaffolding as Resolved.

This is done, we'll iterate and monitor elsewhere if needed.

Jul 10 2024, 1:40 PM · Patch-For-Review, DBA
ABran-WMF closed T367278: Migrate mysql icinga alerts to alert manager - pt-heartbeat + scaffolding, a subtask of T315866: Migrate mysql icinga alerts to alert manager, as Resolved.
Jul 10 2024, 1:39 PM · Patch-For-Review, DBA
ABran-WMF added a comment to T367278: Migrate mysql icinga alerts to alert manager - pt-heartbeat + scaffolding.

The new exporter configuration has been merged and is rolling out to production. Rules have been merged as well

Jul 10 2024, 1:37 PM · Patch-For-Review, DBA
ABran-WMF changed the status of T369715: Gather all mariadb host under the same prometheus label from Open to In Progress.
Jul 10 2024, 1:23 PM · Observability-Metrics, Observability-Alerting, DBA
ABran-WMF placed T369715: Gather all mariadb host under the same prometheus label up for grabs.
Jul 10 2024, 1:23 PM · Observability-Metrics, Observability-Alerting, DBA
ABran-WMF created T369715: Gather all mariadb host under the same prometheus label.
Jul 10 2024, 1:22 PM · Observability-Metrics, Observability-Alerting, DBA
ABran-WMF moved T369654: Q1:rack/setup/install db22[21-40] from Triage to Blocked on the DBA board.
Jul 10 2024, 8:23 AM · DBA, SRE, ops-codfw, Data-Persistence, DC-Ops
ABran-WMF moved T369658: Q1:rack/setup/install pc2017 from Triage to Blocked on the DBA board.
Jul 10 2024, 8:21 AM · DBA, SRE, Data-Persistence, ops-codfw, DC-Ops
ABran-WMF placed T369661: Q1:rack/setup/install pc1017 up for grabs.
Jul 10 2024, 8:20 AM · DBA, SRE, Data-Persistence, ops-eqiad, DC-Ops
ABran-WMF claimed T369661: Q1:rack/setup/install pc1017.
Jul 10 2024, 8:19 AM · DBA, SRE, Data-Persistence, ops-eqiad, DC-Ops

Jul 8 2024

ABran-WMF closed T367281: Migrate mysql icinga alerts to alert manager - disk pressure as Resolved.
Jul 8 2024, 1:36 PM · Patch-For-Review, DBA
ABran-WMF closed T367281: Migrate mysql icinga alerts to alert manager - disk pressure, a subtask of T315866: Migrate mysql icinga alerts to alert manager, as Resolved.
Jul 8 2024, 1:35 PM · Patch-For-Review, DBA
ABran-WMF awarded T368354: Modify db-mysql to connect to an-redacteddb1001 from cumin hosts a Party Time token.
Jul 8 2024, 9:59 AM · Patch-For-Review, Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07)
ABran-WMF added a comment to T368354: Modify db-mysql to connect to an-redacteddb1001 from cumin hosts.

we've not seen any regression since you released the update, I think you're good to go!

Jul 8 2024, 6:37 AM · Patch-For-Review, Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07)

Jul 5 2024

ABran-WMF added a comment to T367278: Migrate mysql icinga alerts to alert manager - pt-heartbeat + scaffolding.

We could also go for Misc hosts in codfw as testing.

Jul 5 2024, 2:15 PM · Patch-For-Review, DBA
ABran-WMF added a comment to T367781: Drop deprecated abuse filter fields on wmf wikis.

will do!

Jul 5 2024, 12:31 PM · Data-Engineering, Schema-change-in-production, DBA
ABran-WMF claimed T239814: Automate DB upgrades.
Jul 5 2024, 12:26 PM · User-Ladsgroup, DBA
ABran-WMF updated subscribers of T367278: Migrate mysql icinga alerts to alert manager - pt-heartbeat + scaffolding.

@fgiunchedi FYI, I've started rolling out the backport version on clouddb hosts. I've left aside clouddb1019, as @fnegri told me it was kind of unstable recently. I'll let it sit for the weekend before merging anything, then go for the remaining hosts that need that version installed. @jcrespo don't worry about this change, it's very safe and monitored on my side; it'll cover backup hosts that have mysqld-exporter installed as well.

Jul 5 2024, 8:54 AM · Patch-For-Review, DBA

Jul 4 2024

ABran-WMF renamed T369295: Cookbook to create a sanitarium host from scratch from Cookbook to create a sanitarium master from scratch to Cookbook to create a sanitarium host from scratch.
Jul 4 2024, 2:44 PM · DBA
fnegri awarded T369295: Cookbook to create a sanitarium host from scratch a Unicorn! token.
Jul 4 2024, 2:20 PM · DBA
ABran-WMF changed the subtype of T369295: Cookbook to create a sanitarium host from scratch from "Feature Request" to "Task".
Jul 4 2024, 2:19 PM · DBA
ABran-WMF changed the status of T369295: Cookbook to create a sanitarium host from scratch from Open to In Progress.
Jul 4 2024, 2:18 PM · DBA
ABran-WMF changed the status of T369295: Cookbook to create a sanitarium host from scratch, a subtask of T362893: Spicerack support for dbctl, from Open to In Progress.
Jul 4 2024, 2:18 PM · Patch-For-Review, DBA, Infrastructure-Foundations, conftool, SRE-tools, Spicerack
ABran-WMF added a parent task for T369295: Cookbook to create a sanitarium host from scratch: T362893: Spicerack support for dbctl.
Jul 4 2024, 2:18 PM · DBA
ABran-WMF added a subtask for T362893: Spicerack support for dbctl: T369295: Cookbook to create a sanitarium host from scratch.
Jul 4 2024, 2:18 PM · Patch-For-Review, DBA, Infrastructure-Foundations, conftool, Spicerack, SRE-tools
ABran-WMF created T369295: Cookbook to create a sanitarium host from scratch.
Jul 4 2024, 2:17 PM · DBA
ABran-WMF added a comment to T368354: Modify db-mysql to connect to an-redacteddb1001 from cumin hosts.

I'm afraid that's an answer I don't have @BTullis, maybe @Marostegui or @Ladsgroup knows.

Jul 4 2024, 2:11 PM · Patch-For-Review, Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07)
ABran-WMF added a comment to T368354: Modify db-mysql to connect to an-redacteddb1001 from cumin hosts.

Amazing 🎉 let's maybe try the first deployment from a canary cumin host so we're 100% sure that there is no breaking change.

Jul 4 2024, 12:27 PM · Patch-For-Review, Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07)
ABran-WMF added a comment to T368354: Modify db-mysql to connect to an-redacteddb1001 from cumin hosts.

Ah @BTullis I see the issue you face, I had the same one, sorry for not spotting it sooner!

Jul 4 2024, 10:06 AM · Patch-For-Review, Data-Services, Data-Persistence, Data-Platform-SRE (2024.06.17 - 2024.07.07)
Ladsgroup awarded T369250: db1213 InnoDB errors a Heartbreak token.
Jul 4 2024, 8:47 AM · DBA
ABran-WMF triaged T369252: monitoring - MariaDB log parsing and log alerting as Medium priority.
Jul 4 2024, 7:44 AM · DBA
ABran-WMF moved T369252: monitoring - MariaDB log parsing and log alerting from Triage to Ready on the DBA board.
Jul 4 2024, 7:29 AM · DBA
ABran-WMF created T369252: monitoring - MariaDB log parsing and log alerting.
Jul 4 2024, 7:28 AM · DBA
ABran-WMF changed the status of T369250: db1213 InnoDB errors from Open to In Progress.
Jul 4 2024, 7:17 AM · DBA
ABran-WMF created T369250: db1213 InnoDB errors.
Jul 4 2024, 7:16 AM · DBA