Opened 4 years ago

Closed 4 years ago

#199 closed (fixed)

Update cylc to 6.5 on accessdev

Reported by: Scott Wales
Owned by: Martin Dix
Priority: major
Component: ACCESS model
Keywords: TIWG
Cc:

Description


Change History (13)

comment:1 Changed 4 years ago by Martin Dix

Hilary's announcement (https://groups.google.com/forum/#!topic/cylc-dev/wmKHcOztWmg) says that 6.6 isn't far off, so it might be better to wait.

comment:2 Changed 4 years ago by Martin Dix

Created a branch mrd599/cylc6p6 that installs cylc 6.6.0, rose 2015.08.0, fcm 2015.08.0.

A clean reboot of accessdev-test gives errors relating to pip:

Notice: /Stage[main]/Pkg::Python::Pip/Package[python-pip]/ensure: created
Error: Could not update: Could not locate the pip command.
Error: /Stage[main]/Rose::Dependencies/Package[requests]/ensure: change from absent to 2.5.1 failed: Could not update: Could not locate the pip command.
Error: Could not set 'present' on ensure: Could not locate the pip command. at 11:/etc/puppet/modules/trac/manifests/packages.pp
Error: Could not set 'present' on ensure: Could not locate the pip command. at 11:/etc/puppet/modules/trac/manifests/packages.pp

etc.

As a result, many Python packages don't get installed. Running puppet apply gives the same error, even though pip itself seems to be installed properly.

Boot command was

./tools/nova-boot --name accessdev-test.nci.org.au --ip 130.56.244.73 --repo git@repos.nci.org.au:p/access.dev/puppet --branch mrd599/cylc6p6 --install-updates   --  --image centos-6.7-20150816 --flavor m1.small.2 --key-name mrd599 --security-groups ssh,http,umui,ping

comment:3 Changed 4 years ago by Scott Wales

This is due to a change in the pip package, which means Puppet looks for the executable in the wrong place: https://tickets.puppetlabs.com/browse/PUP-3829

Try creating a resource like:

file { '/usr/bin/pip-python':
   ensure  => 'link',
   target  => '/usr/bin/pip',
   require => Package['python-pip'],
   # Create the link before any package that uses the pip provider
   before  => Package<| provider == 'pip' |>,
}

which should create a symlink at the path Puppet is looking for.
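
A quick way to check which of the two names is actually on the PATH before and after adding the link is sketched below. This is only an illustration: locate_pip is a hypothetical helper, and the two candidate names are taken from the resource above.

```python
import shutil


def locate_pip(names=("pip-python", "pip")):
    """Map each candidate executable name to its path on PATH, or None."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_pip().items():
        print("%-12s %s" % (name, path or "not found"))
```

If pip-python is reported as not found while pip is present, the symlink is still missing.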

comment:4 Changed 4 years ago by Martin Dix

For now I just created the link by hand, and everything gets installed properly.

Testing shows a cylc issue: https://github.com/cylc/cylc/issues/1582

comment:5 Changed 4 years ago by Martin Dix

Testing new version of fcm

7.3 standard job vabga

The UMUI generates a script with module load fcm/2014.06.0. Using the new fcm (2015.08.0) on accessdev-test and this module on raijin works ok. The bld.cfg and ext.cfg files created with current accessdev default (2014.11.0) and 2015.08.0 are identical.

Other testing

vn10.2 rose stem tests run ok with new utilities.

comment:6 Changed 4 years ago by Scott Wales

Owner: set to Martin Dix
Status: new → assigned

Martin has tested the updated rose, cylc and fcm, and will install them on accessdev proper.

comment:7 Changed 4 years ago by Martin Dix

cylc 6.7.0 has been released.

This includes a change that was supposed to make the remote command compatible with other shells, https://github.com/cylc/cylc/commit/c2c5adc4c5f9f6fd7e9a9c532ee3d857da7a7913.

Effectively this means that cylc on raijin now runs a command like

ssh user@accessdev-test.nci.org.au env CYLC_VERSION=6.7.0 /usr/local/cylc/cylc-6.7.0/bin/cylc message "started"

whereas previously it was

ssh user@accessdev-test.nci.org.au CYLC_VERSION=6.6.1 /usr/local/cylc/cylc-6.6.1/bin/cylc message "started"

The NCI remote-job-submission script rejects this because env isn't on the allowed list.

As a short-term fix, changing lib/cylc/remote.py on raijin to remove the extra env works. However, it's also necessary to strip off the -dirty suffix that gets appended to the version string. git diff gives:

diff --git a/lib/cylc/remote.py b/lib/cylc/remote.py
index 6669e4f..668ed86 100644
--- a/lib/cylc/remote.py
+++ b/lib/cylc/remote.py
@@ -109,7 +109,8 @@ class remrun(object):
                 "use login shell", self.host, self.owner)
 
         # Pass cylc version through.
-        command += ["env", "CYLC_VERSION=%s" % CYLC_VERSION]
+        # dirty suffix causes problems with the remote shell permissions
+        command += ["CYLC_VERSION=%s" % CYLC_VERSION.replace('-dirty','')]
 
         if ssh_login_shell:
             # A login shell will always source /etc/profile and the user's bash
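
The effect of the patch can be sketched in standalone Python. This is an illustration only, not cylc's or NCI's code: build_remote_command, first_executable, is_allowed and ALLOWED_COMMANDS are hypothetical names. The allow-list rule (check the first word that is not a VAR=value assignment) is an assumption about how such a check might behave, and the install path simply mirrors the example commands above.

```python
import os
import re

# Hypothetical allow-list; the real NCI script's rules are not shown in this ticket.
ALLOWED_COMMANDS = {"cylc"}

# Matches a leading shell-style assignment such as CYLC_VERSION=6.7.0
ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")


def clean_version(version):
    """Drop the '-dirty' suffix from a version string."""
    return version.replace("-dirty", "")


def build_remote_command(version, use_env):
    """Return the words that follow `ssh user@host` (path layout is assumed)."""
    version = clean_version(version)
    assignment = "CYLC_VERSION=%s" % version
    cylc = "/usr/local/cylc/cylc-%s/bin/cylc" % version
    prefix = ["env", assignment] if use_env else [assignment]
    return prefix + [cylc, "message", "started"]


def first_executable(command):
    """Skip leading VAR=value words and return the program's base name."""
    for word in command:
        if not ASSIGNMENT.match(word):
            return os.path.basename(word)
    return None


def is_allowed(command):
    """Apply the assumed allow-list rule to a command."""
    return first_executable(command) in ALLOWED_COMMANDS


if __name__ == "__main__":
    for use_env in (True, False):
        command = build_remote_command("6.7.0-dirty", use_env)
        verdict = "allowed" if is_allowed(command) else "rejected"
        print(" ".join(command), "->", verdict)
```

Under these assumptions the form that starts with env is rejected because env is the first program name seen, while the patched form passes because the first non-assignment word is cylc.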


comment:8 Changed 4 years ago by Martin Dix

Now also updated to rose 2015.10.0 and fcm 2015.09.0.

Testing shows a new cylc bug: https://github.com/cylc/cylc/issues/1628

With this patched on raijin, things work properly (see Admin Guides/UpdatingRoseCylc for a description of the testing).

comment:9 Changed 4 years ago by Martin Dix

Cylc 6.7.1 fixes the issue mentioned above.

rose 2015.10.1 and fcm 2015.10.0 have been released.

comment:10 Changed 4 years ago by Martin Dix

Just when I thought this was finally done:
https://github.com/metomi/rose/issues/1737

For now I've changed /usr/local/rose/2015.10.1/bin/rosie-go to just print a message that suggests using the 2015.04.1 version instead.

comment:11 Changed 4 years ago by Martin Dix

Rose 2015.11.0 and cylc 6.7.2 are now installed (branch mrd599/cylc-6.7.2).

Also modified the rose configuration to fix #254.

comment:12 Changed 4 years ago by Martin Dix

From the rose 2015.11.0 release notes:

"rose suite-hook: will no longer retrieve remote job logs by default, this can now be handled by cylc. If this functionality is still required for whatever reason, use the --retrieve-job-logs option."

For backwards compatibility I added this option to rose-task-hook2.

comment:13 Changed 4 years ago by Martin Dix

Resolution: fixed
Status: assigned → closed

All seems to be working ok. Currently running suites (excluding those that haven't done anything for a week):

Version  No. running
6.4.1    5
6.7.1    2
6.7.2    13

The Met Office vn10.3 release notes (https://code.metoffice.gov.uk/trac/um/wiki/ReleaseNotes10.3) say "The rose-stem suite at UM 10.3 is intended for use with FCM 2015.05.0, Rose 2015.06.0 and Cylc 6.4.1", so we're actually ahead.

Closing the ticket before another new version gets released.
