Cloud-Services (Umbrella)
Active, Public

Details

Description

Cloud-Services is the umbrella project for tasks related to the products managed by the Wikimedia Cloud Services team. A general overview of the services offered can be found at wikitech:Help:Cloud Services Introduction.

Subprojects:

The team itself has a separate project, cloud-services-team, which is used to track its backlog and current priorities.

Recent Activity

Wed, Aug 7

Magnus closed T371955: PetScan not responding as Resolved.

Works again

Wed, Aug 7, 9:00 AM · Cloud-VPS
Magnus triaged T371955: PetScan not responding as Unbreak Now! priority.
Wed, Aug 7, 8:41 AM · Cloud-VPS
Magnus created T371955: PetScan not responding.

The Cloud-Services project tag is not intended to have any tasks. Please check the list at https://phabricator.wikimedia.org/project/profile/832/ and replace it with a more specific project tag on this task. Thanks!

Wed, Aug 7, 8:40 AM · Cloud-VPS

Mon, Aug 5

thcipriani added a comment to T370080: Moving proxies across wmcs projects for patchdemo.wmflabs.org.

Current status:

Mon, Aug 5, 4:35 PM · Catalyst (PatchDemo GoLive), Cloud-Services
thcipriani edited projects for T370080: Moving proxies across wmcs projects for patchdemo.wmflabs.org, added: Catalyst (PatchDemo GoLive); removed Catalyst.
Mon, Aug 5, 4:13 PM · Catalyst (PatchDemo GoLive), Cloud-Services

Thu, Aug 1

gerritbot added a comment to T371573: puppet problems mounting cinder volumes (and suggested fixes).

Change #1056606 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] cinderutils: add --allow-unattended-format when preparing volumes

https://gerrit.wikimedia.org/r/1056606

Thu, Aug 1, 1:52 AM · cloud-services-team, collaboration-services, Cloud-VPS, Patch-For-Review
gerritbot added a project to T371573: puppet problems mounting cinder volumes (and suggested fixes): Patch-For-Review.
Thu, Aug 1, 1:51 AM · cloud-services-team, collaboration-services, Cloud-VPS, Patch-For-Review
gerritbot added a comment to T371573: puppet problems mounting cinder volumes (and suggested fixes).

Change #1057000 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] cinderutils: allow floating point numbers for min_gb and max_gb

https://gerrit.wikimedia.org/r/1057000

Thu, Aug 1, 1:51 AM · cloud-services-team, collaboration-services, Cloud-VPS, Patch-For-Review
Dzahn created T371573: puppet problems mounting cinder volumes (and suggested fixes).

Thu, Aug 1, 1:51 AM · cloud-services-team, collaboration-services, Cloud-VPS, Patch-For-Review

Jul 16 2024

matmarex added a comment to T370080: Moving proxies across wmcs projects for patchdemo.wmflabs.org.

I was going to use the migration between the Cloud VPS projects to also migrate from .wmflabs.org to .wmcloud.org. I've already set up a proxy using the new domain in the catalyst project. The old proxy can remain under the visualeditor project until we're ready to delete it (which should make it automatically redirect to the new one, according to https://wikitech.wikimedia.org/wiki/Help:Using_a_web_proxy_to_reach_Cloud_VPS_servers_from_the_internet#Migrate_from_a_*.wmflabs.org_proxy_to_a_*.wmcloud.org_proxy, but I'm a bit scared to test that since I can't undo the deletion).

Jul 16 2024, 4:43 PM · Catalyst (PatchDemo GoLive), Cloud-Services

Jul 15 2024

thcipriani moved T370080: Moving proxies across wmcs projects for patchdemo.wmflabs.org from Backlog to radar on the Catalyst board.
Jul 15 2024, 4:49 PM · Catalyst (PatchDemo GoLive), Cloud-Services
thcipriani created T370080: Moving proxies across wmcs projects for patchdemo.wmflabs.org.

Jul 15 2024, 4:48 PM · Catalyst (PatchDemo GoLive), Cloud-Services

Jul 9 2024

Maintenance_bot removed a project from T167204: disable service groups for non-tools projects: Patch-For-Review.

Jul 9 2024, 8:49 PM · cloud-services-team, MW-1.31-release-notes (WMF-deploy-2018-03-06 (1.31.0-wmf.24)), Cloud-Services

Jul 3 2024

joanna_borun created T369150: Analysis and metrics collection for quarry and superset adoption.

Jul 3 2024, 10:31 AM · Quarry, superset.wmcloud.org

Jun 28 2024

sguebo_WMF closed Restricted Task, a subtask of T50930: Database replication problems - production and labs (tracking), as Resolved.

Jun 28 2024, 2:41 PM · Data-Services, SRE, DBA, Tracking-Neverending

Jun 24 2024

Matanya created T368330: Remove matanya as an admin from VPS projects.

Jun 24 2024, 10:48 PM · Cloud-VPS
Benoit74 created T368265: Disk volumes of cloud instances are completely mixed-up.

Jun 24 2024, 12:57 PM · Cloud-VPS, Cloud-Services-Origin-User, affects-Kiwix-and-openZIM
Benoit74 edited projects for T348226: Read-only access to Wikimedia mirror of Kiwix data in dumps.wikimedia.org/kiwix/, added: Cloud-Services, Cloud-Services-Origin-User; removed cloud-services-team.

Jun 24 2024, 12:57 PM · Data-Services, affects-Kiwix-and-openZIM
Maintenance_bot removed a project from T273950: Modernise memcached systemd unit / sync, and make it presentable: Patch-For-Review.
Jun 24 2024, 11:31 AM · Cloud-Services, serviceops, User-jijiki, SRE
gerritbot added a comment to T273950: Modernise memcached systemd unit / sync, and make it presentable.

Change #1039229 abandoned by Muehlenhoff:

[operations/puppet@production] Configure memcached on idp hosts to run as 'memcache'

Reason:

CAS 7.0 removed the memcached backend, no longer needed

https://gerrit.wikimedia.org/r/1039229

Jun 24 2024, 11:25 AM · Cloud-Services, serviceops, User-jijiki, SRE
MoritzMuehlenhoff updated the task description for T273950: Modernise memcached systemd unit / sync, and make it presentable.
Jun 24 2024, 11:25 AM · Cloud-Services, serviceops, User-jijiki, SRE
MoritzMuehlenhoff added a comment to T273950: Modernise memcached systemd unit / sync, and make it presentable.

CAS 7.0 (which we are currently migrating to) removed the memcached backend. As such, this change is no longer needed for the idp servers; I'll tick them off.

Jun 24 2024, 11:25 AM · Cloud-Services, serviceops, User-jijiki, SRE

Jun 13 2024

dcaro merged task T367349: Fix HA proxy load-balancer health check monitor to not poll nodes where the API is not responding into T367389: [k8s,infra,alerting] improve HAproxy and k8s apiserver interaction.
Jun 13 2024, 9:50 AM · Cloud-Services, cloud-services-team, Sustainability (Incident Followup)

Jun 12 2024

Andrew created T367349: Fix HA proxy load-balancer health check monitor to not poll nodes where the API is not responding.

Jun 12 2024, 6:53 PM · Cloud-Services, cloud-services-team, Sustainability (Incident Followup)
Andrew added a subtask for T367348: Incident: 2024-06-12 toolforge k8s control plane: T333934: [k8s,infra] scale up coredns replicas.
Jun 12 2024, 6:52 PM · User-aborrero, Toolforge, cloud-services-team, Sustainability (Incident Followup)
Andrew created T367348: Incident: 2024-06-12 toolforge k8s control plane.

Jun 12 2024, 6:51 PM · User-aborrero, Toolforge, cloud-services-team, Sustainability (Incident Followup)

Jun 11 2024

taavi closed T103874: Update Vagrant role for Extension:OpenStackManager, a subtask of T97334: Grant shell user right with project memberships and remove autocreation of shell requests, as Declined.

Jun 11 2024, 6:25 PM · WMF-deploy-2015-06-23_(1.26wmf11), WMF-deploy-2015-06-16_(1.26wmf10), WMF-deploy-2015-06-30_(1.26wmf12), wikitech.wikimedia.org

Jun 9 2024

bd808 closed T163340: Judy PECL module not available in Tool labs, need to install it as Declined.

Grid is gone. Installing pecl packages like this should be possible with the php buildpack.

Jun 9 2024, 12:18 AM · Cloud-Services, Toolforge

Jun 5 2024

MoritzMuehlenhoff updated the task description for T273950: Modernise memcached systemd unit / sync, and make it presentable.
Jun 5 2024, 2:33 PM · Cloud-Services, serviceops, User-jijiki, SRE
gerritbot added a comment to T273950: Modernise memcached systemd unit / sync, and make it presentable.

Change #1039229 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Configure memcached on idp hosts to run as 'memcache'

https://gerrit.wikimedia.org/r/1039229

Jun 5 2024, 2:32 PM · Cloud-Services, serviceops, User-jijiki, SRE
gerritbot added a comment to T273950: Modernise memcached systemd unit / sync, and make it presentable.

Change #1039226 merged by Muehlenhoff:

[operations/puppet@production] Fix Hiera option name

https://gerrit.wikimedia.org/r/1039226

Jun 5 2024, 2:23 PM · Cloud-Services, serviceops, User-jijiki, SRE
gerritbot added a comment to T273950: Modernise memcached systemd unit / sync, and make it presentable.

Change #1039226 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Fix Hiera option name

https://gerrit.wikimedia.org/r/1039226

Jun 5 2024, 2:15 PM · Cloud-Services, serviceops, User-jijiki, SRE
gerritbot added a comment to T273950: Modernise memcached systemd unit / sync, and make it presentable.

Change #1039206 merged by Muehlenhoff:

[operations/puppet@production] Configure memcached on idp-test hosts to run as 'memcache'

https://gerrit.wikimedia.org/r/1039206

Jun 5 2024, 1:56 PM · Cloud-Services, serviceops, User-jijiki, SRE
gerritbot added a project to T273950: Modernise memcached systemd unit / sync, and make it presentable: Patch-For-Review.
Jun 5 2024, 12:35 PM · Cloud-Services, serviceops, User-jijiki, SRE
gerritbot added a comment to T273950: Modernise memcached systemd unit / sync, and make it presentable.

Change #1039206 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Configure memcached on idp-test hosts to run as 'memcache'

https://gerrit.wikimedia.org/r/1039206

Jun 5 2024, 12:34 PM · Cloud-Services, serviceops, User-jijiki, SRE

Jun 4 2024

jijiki changed the status of T273950: Modernise memcached systemd unit / sync, and make it presentable from Open to In Progress.
Jun 4 2024, 3:13 PM · Cloud-Services, serviceops, User-jijiki, SRE
jijiki updated the task description for T273950: Modernise memcached systemd unit / sync, and make it presentable.
Jun 4 2024, 3:13 PM · Cloud-Services, serviceops, User-jijiki, SRE
jijiki updated the task description for T273950: Modernise memcached systemd unit / sync, and make it presentable.
Jun 4 2024, 9:18 AM · Cloud-Services, serviceops, User-jijiki, SRE
jijiki updated the task description for T273950: Modernise memcached systemd unit / sync, and make it presentable.
Jun 4 2024, 9:05 AM · Cloud-Services, serviceops, User-jijiki, SRE
jijiki updated the task description for T273950: Modernise memcached systemd unit / sync, and make it presentable.
Jun 4 2024, 9:03 AM · Cloud-Services, serviceops, User-jijiki, SRE
jijiki updated the task description for T273950: Modernise memcached systemd unit / sync, and make it presentable.
Jun 4 2024, 8:45 AM · Cloud-Services, serviceops, User-jijiki, SRE
jijiki changed the status of T273950: Modernise memcached systemd unit / sync, and make it presentable from In Progress to Open.

Jun 4 2024, 8:44 AM · Cloud-Services, serviceops, User-jijiki, SRE

May 21 2024

Dzahn added a project to T364060: Degraded RAID on cloudcephosd1031: Cloud-Services.

May 21 2024, 7:23 PM · Cloud-VPS, Cloud-Services-Origin-Alert, cloud-services-team, DC-Ops, SRE, ops-eqiad

May 7 2024

dcaro closed T364376: cloudcephosd: the service unit user@0.service is in failed status as Resolved.

@MoritzMuehlenhoff mentioned on IRC that this is probably related to the glibc upgrade combined with systemd issues under load, so it should be very infrequent.

May 7 2024, 11:47 AM · Cloud-Services, cloud-services-team
dcaro merged T364381: SystemdUnitDown into T364376: cloudcephosd: the service unit user@0.service is in failed status.
May 7 2024, 11:43 AM · Cloud-Services, cloud-services-team
dcaro added a comment to T364376: cloudcephosd: the service unit user@0.service is in failed status.

It might be related to T199911: Systemd session creation fails under I/O load

May 7 2024, 10:47 AM · Cloud-Services, cloud-services-team
dcaro added a comment to T364376: cloudcephosd: the service unit user@0.service is in failed status.

The session seems to be a root login from cumin2002 (for all nodes):

May 7 2024, 9:47 AM · Cloud-Services, cloud-services-team
aborrero triaged T364376: cloudcephosd: the service unit user@0.service is in failed status as Low priority.
May 7 2024, 9:47 AM · Cloud-Services, cloud-services-team
aborrero edited projects for T364376: cloudcephosd: the service unit user@0.service is in failed status, added: Cloud-Services; removed Cloud-VPS.
May 7 2024, 9:46 AM · Cloud-Services, cloud-services-team
aborrero renamed T364376: cloudcephosd: the service unit user@0.service is in failed status from cloudcephosd: user@0.service is in failed status to cloudcephosd: the service unit user@0.service is in failed status.
May 7 2024, 9:46 AM · Cloud-Services, cloud-services-team