Teuthology
Presented 2011-07-01
tommi.virtanen@dreamhost.com
image credit: http://www.flickr.com/photos/peterblapps/3250800528/
Ceph as in
Cephalopoda
Mollusca
Invertebrata
Teuthology
Malacology
Not your grandmother's software stack
We tried Autotest
... and quickly discovered its limitations
Currently at 15 independent patches, 24 files changed, 575 insertions(+), 19 deletions(-)
Realized Autotest's architecture is working against us.
We still use it for its packaged "client side" tests, but not its multi-machine features.
Multi-machine control
Python + Paramiko (SSH) + gevent = orchestra
Real-time
Interactive
Central controller
Full SSH protocol (channels!)
Not Chef
Not Fabric
cluster = Cluster(...)
cluster.run(...)
cluster.only('x86').run(...)
cluster.exclude('x86').run(...)
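The only/exclude calls above suggest role-based filtering that returns a narrower cluster. A minimal sketch of that idea follows. ToyCluster, its constructor, and the host names are hypothetical stand-ins, not orchestra's actual classes, and run here only formats strings instead of opening SSH connections.

```python
class ToyCluster:
    """Hypothetical stand-in for a cluster: maps host name -> set of roles."""

    def __init__(self, remotes):
        self.remotes = dict(remotes)

    def only(self, role):
        # Keep hosts that carry the given role.
        return ToyCluster({h: r for h, r in self.remotes.items() if role in r})

    def exclude(self, role):
        # Keep hosts that do NOT carry the given role.
        return ToyCluster({h: r for h, r in self.remotes.items() if role not in r})

    def run(self, args):
        # A real controller would execute over SSH; this only records intent.
        return ["{}: {}".format(host, " ".join(args)) for host in sorted(self.remotes)]


cluster = ToyCluster({
    "host-a": {"x86", "osd.0"},
    "host-b": {"x86", "osd.1"},
    "host-c": {"arm", "client.0"},
})
x86_results = cluster.only("x86").run(["uname", "-m"])
other_results = cluster.exclude("x86").run(["uname", "-m"])
```

Because each filter returns a new object, calls can be chained without changing the original cluster.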
http://github.com/tv42/orchestra
Teuthology is a test runner
Run tasks on targets as told to by roles.
Automatically:
Setup
Monitor health
Run test(s)
Archive results
Archive logs, core dumps, etc
Clean up
http://github.com/tv42/teuthology
Read the README
Run tasks on targets as told to by roles.
targets:
- [email protected]
- [email protected]
- [email protected]
You need to have SSH working, without passphrases.
You need passphraseless sudo on the remote host.
YAML format: lists, dicts, strings, numbers.
Run tasks on targets as told to by roles.
roles:
- [mon.0, mds.0, osd.0]
- [mon.1, osd.1]
- [mon.2, client.0]
Run tasks on targets as told to by roles.
roles:
- [mon.0, mds.0, osd.0]
- [mon.1, osd.1]
- [mon.2, client.0]
targets:
- [email protected]
- [email protected]
- ubuntu@sepiaZZ...
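One way to read the two lists together is positionally: the first role list goes to the first target, and so on. The sketch below assumes that pairing (the README is the place to confirm it) and uses hypothetical host names, since the real ones are not legible in the slide.

```python
roles = [
    ["mon.0", "mds.0", "osd.0"],
    ["mon.1", "osd.1"],
    ["mon.2", "client.0"],
]
# Hypothetical targets; the slide's actual host names are not recoverable.
targets = ["ubuntu@host-a", "ubuntu@host-b", "ubuntu@host-c"]

if len(roles) != len(targets):
    raise ValueError("need exactly one target per role list")

# Assumed positional pairing: roles[i] runs on targets[i].
role_to_target = {
    role: target
    for role_list, target in zip(roles, targets)
    for role in role_list
}
```

With this mapping, looking up a role such as osd.0 gives the machine that should run it.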
Run tasks on targets as told to by roles.
tasks:
- ceph:
- kclient: [client.0]
- autotest:
    client.0: [dbench]
Interactive mode
tasks:
- interactive:
INFO:teuthology.run_tasks:Running task interactive...
Ceph test interactive mode, use ctx to interact with the cluster, press control-D to exit...
>>> 1+1
2
>>>
Interactive mode
>>> ctx.cluster.only('osd.0').run(args=['uptime'])
INFO:orchestra.run.out: 13:05:38 up 42 days, 23:17, 0 users, load average: 0.12, 0.09, 0.07
[<orchestra.run.RemoteProcess object at 0x28bd110>]
One RemoteProcess per command run.
Using just one Remote first
>>> (remote,) = ctx.cluster.only('osd.0').remotes.keys()
>>> proc = remote.run(args=['echo', '*'])
INFO:orchestra.run.out:*
>>> proc
<orchestra.run.RemoteProcess ...>
>>> proc.command
"echo '*'"
Shell quoting done for you.
Works like ctx.cluster.run.
Just one RemoteProcess, not a list.
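Passing args as a list lets the library quote each argument before it reaches the remote shell. The sketch below reproduces the proc.command value shown above using the standard library's shlex.quote. Whether orchestra quotes in exactly this way is an assumption, and quote_command is a hypothetical helper.

```python
import shlex


def quote_command(args):
    """Join an argument list into one shell-safe command string."""
    return " ".join(shlex.quote(arg) for arg in args)


# The '*' is quoted, so the remote shell will not expand it as a glob.
command = quote_command(["echo", "*"])

# Embedded quotes and spaces survive a round trip through shell parsing.
tricky = quote_command(["echo", "it's here"])
```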
Failing processes
>>> remote.run(args=['bork'])
INFO:orchestra.run.err:bash: bork: command not found
...
CommandFailedError: Command failed with status 127: 'bork'
>>> proc = remote.run(args=['bork'],
...     check_status=False)
INFO:orchestra.run.err:bash: bork: command not found
>>> proc.exitstatus
127
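The check_status switch shown above chooses between raising on a non-zero exit and handing the status back to the caller. A local analogue can be sketched with subprocess. Here run_local and this CommandFailedError class are hypothetical stand-ins written for illustration, not orchestra's code.

```python
import subprocess
import sys


class CommandFailedError(Exception):
    """Local stand-in for the error shown in the session above."""

    def __init__(self, command, exitstatus):
        super().__init__(
            "Command failed with status {}: {!r}".format(exitstatus, command))
        self.command = command
        self.exitstatus = exitstatus


def run_local(args, check_status=True):
    """Run a local command; raise on non-zero exit unless check_status=False."""
    proc = subprocess.run(args)
    if check_status and proc.returncode != 0:
        raise CommandFailedError(" ".join(args), proc.returncode)
    return proc.returncode


# A child that always exits with status 127, as a missing command would.
failing = [sys.executable, "-c", "import sys; sys.exit(127)"]

status = run_local(failing, check_status=False)

try:
    run_local(failing)
    raised = None
except CommandFailedError as err:
    raised = err
```

Raising by default means a forgotten status check fails loudly instead of silently continuing the test.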
Concurrency
>>> proc = remote.run(args=['uptime'], wait=False)
>>> proc
<orchestra.run.RemoteProcess object at 0x28bd1d0>
>>> proc.exitstatus
<gevent.event.AsyncResult object at 0x28c2a10>
Concurrency
>>> proc.exitstatus
<gevent.event.AsyncResult object at 0x28c2a10>
>>> import time; time.sleep(0)
INFO:orchestra.run.out: 13:16:48 up 42 days, 23:28, 0 users, load average: 0.35, 0.15, 0.08
>>> proc.exitstatus
<gevent.event.AsyncResult object at 0x28c2a10>
>>> proc.exitstatus.get()
0
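With wait=False the call returns immediately and exitstatus is a placeholder that is filled in later; get() blocks until it is. The same shape can be sketched with the standard library, where concurrent.futures.Future plays the role of gevent's AsyncResult. This is an analogy using threads and a local subprocess, not a description of orchestra's internals, and start_local is a hypothetical helper.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)


def start_local(args):
    """Start a command in the background; return a Future for its exit status."""
    return _pool.submit(lambda: subprocess.run(args).returncode)


child = [sys.executable, "-c", "import time; time.sleep(0.2)"]
exitstatus = start_local(child)

# Other work could happen here while the command runs.
status = exitstatus.result()   # blocks, like AsyncResult.get()
finished = exitstatus.done()   # True once the result is available
_pool.shutdown()
```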
Capturing stdout/stderr
>>> from orchestra import run
>>> proc = remote.run(args=['uname', '-m'],
...     wait=False, stdout=run.PIPE)
>>> proc.exitstatus
<gevent.event.AsyncResult object at 0x28c2dd0>
>>> proc.exitstatus.ready() # just for debug
False
>>> proc.stdout.read()
'x86_64\n'
>>> proc.exitstatus.get()
0
Deadlocks you must avoid:
stdout vs stderr
stdout/err vs stdin
stdout/err vs exit
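The first of these arises when the reader blocks on one pipe while the child is stuck writing to the other, whose buffer is full. A local sketch with the standard library: subprocess's communicate() drains stdout and stderr together and only then waits for exit, which also covers the "stdout/err vs exit" case. The child script and data sizes are illustrative assumptions.

```python
import subprocess
import sys

# Child writes far more than a typical pipe buffer to BOTH streams.
CHILD = (
    "import sys\n"
    "for _ in range(200):\n"
    "    sys.stdout.write('o' * 1000)\n"
    "    sys.stderr.write('e' * 1000)\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

# Risky alternative: proc.stdout.read() alone can hang once the stderr
# pipe fills up, because nobody is reading it.
# communicate() reads both pipes concurrently and then waits for exit.
out, err = proc.communicate()
status = proc.returncode
```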
Using Cluster
>>> processes = ctx.cluster.run(
...     args=['uname', '-m'],
...     wait=False,
...     stdout=run.PIPE)
>>> processes
[<orchestra.run.RemoteProcess object at 0x28bdbf0>, <orchestra.run.RemoteProcess object at 0x28bdb90>, <orchestra.run.RemoteProcess object at 0x28bdad0>]
>>> [p.stdout.read() for p in processes]
['x86_64\n', 'x86_64\n', 'x86_64\n']
>>> run.wait(processes)
>>>
Controlling stdout/stderr logging
>>> import logging
>>> log = logging.getLogger(__name__)
>>> log.info('foo')
INFO:__builtin__:foo
>>> ctx.cluster.only('osd.0').run(
...     args=['uptime'],
...     logger=log.getChild('uptime'))
INFO:__builtin__.uptime.out: 13:52:49 up 43 days, 4 min, 0 users, load average: 0.00, 0.01, 0.05
[<orchestra.run.RemoteProcess object at 0x28bdb90>]
>>>
Usually looks like teuthology.task.foo
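The logger names in the session above follow the standard logging hierarchy, where getChild appends a dotted suffix. A short sketch of how the names compose, assuming a hypothetical task module named teuthology.task.foo and assuming the runner adds .out and .err for the two streams, as the log line above suggests:

```python
import logging

# Hypothetical task module name, following the pattern on the slide.
log = logging.getLogger("teuthology.task.foo")

uptime_log = log.getChild("uptime")
# Assumption: the runner appends .out / .err for the two streams,
# as the INFO:...uptime.out line above suggests.
out_log = uptime_log.getChild("out")
err_log = uptime_log.getChild("err")

names = [uptime_log.name, out_log.name, err_log.name]
```

Giving each command its own child logger makes it possible to filter or silence one command's output without touching the rest.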
Tasks can be context managers
tasks:
- ceph:
- kclient: ...
- autotest: ...
- interactive:
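If each task is a context manager, the task list behaves like nested with blocks: earlier tasks (such as ceph) stay set up while later ones run, and cleanup happens in reverse order. A minimal sketch of that nesting using contextlib follows. The task and run_tasks functions here are illustrative, not teuthology's own code.

```python
from contextlib import ExitStack, contextmanager

events = []


@contextmanager
def task(name):
    """Hypothetical task: set up, hand control to later tasks, then clean up."""
    events.append("setup " + name)
    try:
        yield
    finally:
        events.append("cleanup " + name)


def run_tasks(names):
    # Enter each task in order; ExitStack exits them in reverse order.
    with ExitStack() as stack:
        for name in names:
            stack.enter_context(task(name))
        events.append("all tasks entered")


run_tasks(["ceph", "kclient", "autotest"])
```

After the call, events lists the three setups in order, then the marker, then the three cleanups in reverse order.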
/tmp/cephtest
Must not exist already, or target is dirty (see teuthology-nuke, later)
Used by tasks to store things in
Tasks are responsible for cleaning up after themselves (no toplevel rm -rf, to flush out the bugs)
Anything in /tmp/cephtest/archive gets archived
Please bzip2 -9 any big files your task leaves in archive
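A task would most likely just run bzip2 -9 on the remote host. For reference, the same compression can be sketched locally with Python's bz2 module. The compress_in_place helper and the file name are hypothetical, and the demo writes to a throwaway temporary directory rather than a real archive.

```python
import bz2
import os
import shutil
import tempfile


def compress_in_place(path):
    """Compress path to path + '.bz2' at level 9 and remove the original."""
    compressed = path + ".bz2"
    with open(path, "rb") as src, bz2.open(compressed, "wb", compresslevel=9) as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return compressed


# Demonstrate on a throwaway directory rather than a real archive.
workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "osd.0.log")
with open(log_path, "wb") as f:
    f.write(b"repetitive log line\n" * 10000)

result = compress_in_place(log_path)
original_gone = not os.path.exists(log_path)
with bz2.open(result, "rb") as f:
    restored = f.read()
shutil.rmtree(workdir)
```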
Cleanups & failures
Cleanup can fail; further cleanups are still attempted, so always study the first error, not the last one.
If a task fails to clean up, the targets are left "dirty".
teuthology-nuke is a Big Hammer.
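The advice to read the first error can be illustrated with a small sketch: every cleanup is attempted, failures are collected rather than aborting, and the earliest failure is usually the cause of the later ones. The function names and error messages below are made up for illustration.

```python
def run_cleanups(cleanups):
    """Run every cleanup, newest first; collect errors instead of stopping."""
    errors = []
    for cleanup in reversed(cleanups):
        try:
            cleanup()
        except Exception as exc:  # keep going so later cleanups still run
            errors.append(exc)
    return errors


ran = []


def remove_test_dir():
    ran.append("remove_test_dir")
    raise OSError("directory not empty")


def stop_daemons():
    ran.append("stop_daemons")
    raise RuntimeError("daemon did not exit")


# Registered in setup order, so stop_daemons is cleaned up first.
errors = run_cleanups([remove_test_dir, stop_daemons])
first_error = errors[0]
dirty = bool(errors)  # any failed cleanup leaves the target "dirty"
```

Here the directory removal fails only because the daemon never stopped, so the first recorded error is the one worth investigating.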
Archived results
2011-06-21T10-00-44/
├── ceph-sha1
├── config.yaml
├── remote
│   ├── [email protected]
│   │   ├── log
│   │   │   ├── client.admin.log.bz2
│   │   │   ├── mds.0.log.bz2
│   │   │   ├── mon.0.log.bz2
│   │   │   └── osd.0.log.bz2
│   │   └── syslog
│   │       ├── kern.log.bz2
│   │       └── misc.log.bz2
│   ├── [email protected] ...
│   └── [email protected]
│       ├── autotest
│       │   └── ...
│       ├── log ...
│       └── syslog ...
├── summary.yaml
└── teuthology.log
gitbuilder
A low-key low-hype continuous integration tool
Builds tags and heads of branches
On a bad build, it tries older commits until it finds a green one
We have it building ceph and our kernel fork
http://ceph.newdream.net/gitbuilder/
http://ceph.newdream.net/gitbuilder-i386/
http://ceph.newdream.net/gitbuilder-gcov-amd64/
http://ceph.newdream.net/gitbuilder-deb-amd64/
http://ceph.newdream.net/gitbuilder-kernel-amd64/
We made gitbuilder create tarballs
http://ceph.newdream.net/gitbuilder/output/ref/origin_master/
Index of /output/ref/origin_master/
mode  links  bytes      last-changed  name
dr-x  2      4096       Jun 29 13:58  ./
dr-x  28     12288      Jun 29 15:16  ../
-r--  1      149323650  Jun 29 13:58  ceph.x86_64.tgz
-r--  1      41         Jun 29 13:57  sha1
Don't trust the links; ProxyPass confuses the web server
Fetch .../output/ref/origin_master/sha1, then fetch .../output/sha1/SHA1_HERE/ceph.x86_64.tgz
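The two-step fetch can be sketched as a pair of URL builders. The base URL and the ref/ path follow the index listing shown on this slide, and the sha1/ path follows the instruction above. The function names and the example commit id are hypothetical, and no network request is made here.

```python
BASE = "http://ceph.newdream.net/gitbuilder/output"


def sha1_url(ref="origin_master", base=BASE):
    """URL of the small file holding the commit id for a branch."""
    return "{}/ref/{}/sha1".format(base, ref)


def tarball_url(sha1, base=BASE, name="ceph.x86_64.tgz"):
    """URL of the tarball for a specific commit id."""
    # The sha1 file ends with a newline (41 bytes in the listing), so strip it.
    return "{}/sha1/{}/{}".format(base, sha1.strip(), name)


# Placeholder id for illustration; a real one comes from fetching sha1_url().
example_sha1 = "0123456789abcdef0123456789abcdef01234567\n"
first = sha1_url()
second = tarball_url(example_sha1)
```

Resolving the branch to a commit id first pins the download to one exact build, even if the branch moves while the test is running.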
Future and topics not covered
teuthology-suite
nightly runs
machine allocation
gcov
flavors
custom ceph builds
installing custom kernels
failure testing
monitor health