Remove or replace deployment-snapshot03.deployment-prep.eqiad1.wikimedia.cloud (Buster deprecation)
Closed, Resolved (Public)

Description

Debian Buster is well out of upstream support and all Buster VMs need to be replaced.

Event Timeline

@BTullis I see you have created deployment-snapshot05 (Bullseye), although this new host was not part of https://gerrit.wikimedia.org/r/c/operations/dumps/scap/+/1008451, nor is it part of the mediawiki-installation dsh group. Do we have to add snapshot05 to your 'scap' repository as well, or is it fine to just add it to the dsh group?

Gehel triaged this task as High priority. Jul 25 2024, 7:21 PM

Change #1059891 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/dumps/scap@master] Update the beta cluster scap targets for dumps

https://gerrit.wikimedia.org/r/1059891

Change #1059893 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Update the mediawiki-installation dsh group with new beta snapshot host

https://gerrit.wikimedia.org/r/1059893

Thanks @Southparkfan and @Andrew for the ping and apologies for the delay in responding.

I have pushed patches to update both the dumps/scap repository and the mediawiki-installation dsh group.

I think that once these are both merged we will be able to decommission deployment-snapshot03.

Change #1059891 merged by Btullis:

[operations/dumps/scap@master] Update the beta cluster scap targets for dumps

https://gerrit.wikimedia.org/r/1059891

Change #1059893 merged by Btullis:

[operations/puppet@production] Update the mediawiki-installation dsh group with new beta snapshot host

https://gerrit.wikimedia.org/r/1059893

I deployed the updated dumps to deployment-snapshot05.deployment-prep.eqiad1.wikimedia.cloud with:

btullis@deployment-deploy04:/srv/deployment/dumps/dumps$ scap deploy
13:25:25 Started deploy [dumps/dumps@0d1f9be]
13:25:25 Deploying Rev: HEAD = 0d1f9be3610716a30b97df2ca671cc246c62c8f2
13:25:25 Started deploy [dumps/dumps@0d1f9be]: (no justification provided)
13:25:25 
== DEFAULT ==
:* deployment-snapshot05.deployment-prep.eqiad1.wikimedia.cloud
13:25:26 dumps/dumps: fetch stage(s): 100% (in-flight: 0; ok: 1; fail: 0; left: 0) |
13:25:27 dumps/dumps: config_deploy stage(s): 100% (in-flight: 0; ok: 1; fail: 0; left: 0) |
13:25:28 dumps/dumps: promote stage(s): 100% (in-flight: 0; ok: 1; fail: 0; left: 0) |
13:25:28 default deploy successful
13:25:28 
== DEFAULT ==
:* deployment-snapshot05.deployment-prep.eqiad1.wikimedia.cloud
13:25:28 dumps/dumps: finalize stage(s): 100% (in-flight: 0; ok: 1; fail: 0; left: 0) |
13:25:28 default deploy successful
13:25:28 Finished deploy [dumps/dumps@0d1f9be]: (no justification provided) (duration: 00m 03s)
13:25:28 Finished deploy [dumps/dumps@0d1f9be] (duration: 00m 03s)

deployment-snapshot03.deployment-prep.eqiad1.wikimedia.cloud has also been removed from the mediawiki-installation dsh group, so I believe that we are clear to decommission it.

hashar reopened this task as Open. Edited Aug 7 2024, 2:27 PM
hashar subscribed.

deployment-snapshot03 is still referenced in the dsh groups, which causes scap to fail:

grep -R deployment-snapshot /etc/dsh
/etc/dsh/group/mediawiki-installation:deployment-snapshot03.deployment-prep.eqiad1.wikimedia.cloud
/etc/dsh/group/scap_targets:deployment-snapshot03.deployment-prep.eqiad1.wikimedia.cloud
/etc/dsh/group/scap_targets:deployment-snapshot05.deployment-prep.eqiad1.wikimedia.cloud

And my guess is that MediaWiki is not deployed/updated on the new deployment-snapshot05, since it is not in the mediawiki-installation group.
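
A check like the grep above can be wrapped in a small helper that reports every dsh group file still listing a given host. This is only a sketch: the function name `find_stale_host` and its optional directory argument are illustrative assumptions, not part of scap or Puppet, and it assumes each group file holds one hostname per line.

```shell
# Hypothetical helper: list the dsh group files that still mention a host.
# Usage: find_stale_host HOSTNAME [DSH_DIR]   (DSH_DIR defaults to /etc/dsh)
find_stale_host() {
    host="$1"
    dsh_dir="${2:-/etc/dsh}"
    # -R recurse, -l print file names only, -F fixed string, -x match whole lines.
    # grep exits non-zero when nothing matches; treat that as an empty result.
    { grep -RlFx -- "$host" "$dsh_dir" || true; } | sort
}
```

For example, `find_stale_host deployment-snapshot03.deployment-prep.eqiad1.wikimedia.cloud` would print one path per group file that still contains that exact hostname, and nothing once the host is gone from all of them.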

deployment-snapshot03 is still referenced in the dsh groups, which causes scap to fail :)

Oh dear, sorry. I thought I removed that reference in https://gerrit.wikimedia.org/r/c/operations/puppet/+/1059893/2/hieradata/cloud/eqiad1/deployment-prep/common.yaml

I'll search for any other references now.

I have fixed the Puppet server (T371982). I then ran Puppet on deployment-deploy04, which caught up with 2+ weeks of updates.

--- /etc/dsh/group/mediawiki-installation	2024-07-22 15:08:02.655659756 +0000
+++ /tmp/puppet-file20240807-1692853-ttna68	2024-08-07 15:13:55.490161364 +0000
@@ -9,5 +9,5 @@
 deployment-mediawiki14.deployment-prep.eqiad1.wikimedia.cloud
 deployment-mwmaint03.deployment-prep.eqiad1.wikimedia.cloud
 deployment-parsoid14.deployment-prep.eqiad1.wikimedia.cloud
-deployment-snapshot03.deployment-prep.eqiad1.wikimedia.cloud
+deployment-snapshot05.deployment-prep.eqiad1.wikimedia.cloud

And:

$ grep -R deployment-snapshot /etc/dsh
/etc/dsh/group/mediawiki-installation:deployment-snapshot05.deployment-prep.eqiad1.wikimedia.cloud
/etc/dsh/group/scap_targets:deployment-snapshot03.deployment-prep.eqiad1.wikimedia.cloud
/etc/dsh/group/scap_targets:deployment-snapshot05.deployment-prep.eqiad1.wikimedia.cloud

deployment-snapshot03 is still in scap_targets but I am choosing to ignore that.
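
Leftover entries like this one can be surfaced by taking the set difference between two group files. The following is a minimal sketch, assuming one hostname per line in each file; the function name `only_in_first` is hypothetical and not an existing scap or dsh command.

```shell
# Hypothetical helper: print hosts listed in the first group file but not in the second.
# Usage: only_in_first FILE_A FILE_B
only_in_first() {
    sorted_a=$(mktemp)
    sorted_b=$(mktemp)
    # comm requires sorted input; use the C locale so sort and comm agree on ordering.
    LC_ALL=C sort -u "$1" > "$sorted_a"
    LC_ALL=C sort -u "$2" > "$sorted_b"
    # -2 drops lines unique to the second file, -3 drops lines common to both.
    LC_ALL=C comm -23 "$sorted_a" "$sorted_b"
    rm -f "$sorted_a" "$sorted_b"
}
```

Running `only_in_first /etc/dsh/group/scap_targets /etc/dsh/group/mediawiki-installation` would list hosts that are scap targets but absent from the mediawiki-installation group. Whether each such difference is intentional still needs a human decision.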

The https://integration.wikimedia.org/ci/job/beta-scap-sync-world/ job is passing again ;)

deployment-snapshot03 is still referenced in the dsh groups, which causes scap to fail :)

Oh dear, sorry. I thought I removed that reference in https://gerrit.wikimedia.org/r/c/operations/puppet/+/1059893/2/hieradata/cloud/eqiad1/deployment-prep/common.yaml

I'll search for any other references now.

@BTullis, I forgot to say yesterday that your comment is what prompted me to verify whether Puppet was working properly, which ultimately led to the resolution. Thank you! :)