Installation
------------

The best way to get god is via rubygems:

```terminal
$ [sudo] gem install god
```

Requirements
------------

God currently only works on *Linux (kernel 2.6.15+), BSD,* and *Darwin*
systems. No support for Windows is planned. Event-based conditions on Linux
systems require the `cn` (connector) kernel module to be loaded or compiled
into the kernel, and god must be run as root.

The following systems have been tested. Help us test it on others!

* Darwin 10.4.10
* RedHat Fedora 6-15
* Ubuntu Dapper (no events)
* Ubuntu Feisty
* CentOS 4.5 (no events), 5, 6


Quick Start
-----------

Note: this quick start guide requires god 0.12.0 or above. You can check your
version by running:

```terminal
$ god --version
```

The easiest way to understand how god will make your life better is by trying
out a simple example. To get you up and running quickly, I'll show you how to
keep a trivial server running.

Open up a new directory and write a simple server. Let's call it
`simple.rb`:

```ruby
loop do
  puts 'Hello'
  sleep 1
end
```

Now we'll write a god config file that tells god about our process. Place it
in the same directory and call it `simple.god`:

```ruby
God.watch do |w|
  w.name = "simple"
  w.start = "ruby /full/path/to/simple.rb"
  w.keepalive
end
```

This is the simplest possible god configuration. We start by declaring a
`God.watch` block.  A watch in god represents a process that we want to watch
and control. Each watch must have, at minimum, a unique name and a command that
tells god how to start the process. The `keepalive` declaration tells god to
keep this process alive. If the process is not running when god starts, it will
be started. If the process dies, it will be restarted.

In this example the `simple` process runs in the foreground, so god will take
care of daemonizing it and keeping track of the PID for us. When possible, it's
best to let god daemonize processes for us; that way we don't have to worry
about specifying and keeping track of PID files. Later on we'll see how to
manage processes that can't run in the foreground or that require PID files to
be specified.

To run god, we give it the configuration file we wrote with `-c`. To see what's
going on, we can ask it to run in the foreground with `-D`:

```terminal
$ god -c path/to/simple.god -D
```

There are two ways that god can monitor your process. The first and better way
is with process events. Not every system supports it, but those that do will
automatically use it. With events, god will know immediately when a process
exits. For those systems without process event support, god will use a polling
mechanism. The output you see throughout this section will show both ways.
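To make the polling approach more concrete, here is a minimal sketch of a liveness check in plain Ruby. It only illustrates the general idea and is not taken from god's source; the `process_alive?` helper is a hypothetical name. Sending signal `0` with `Process.kill` delivers nothing to the target but still reports whether the PID exists.

```ruby
require 'rbconfig'

# Hypothetical helper for illustration; not part of god's API.
# Signal 0 performs the existence and permission checks without
# actually delivering a signal to the process.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no process with this PID
rescue Errno::EPERM
  true  # the process exists but belongs to another user
end

# Start a long-sleeping child, then poll it the way a monitor might.
child = Process.spawn(RbConfig.ruby, '-e', 'sleep 30')
before = process_alive?(child)

Process.kill('TERM', child)
Process.wait(child) # reap the child so its PID is released
after = process_alive?(child)

puts "alive before kill: #{before}, after kill: #{after}"
```

A poller has to repeat a check like this on a timer, which is why it can only notice an exit at the next interval, whereas an event arrives as soon as the process exits.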

After starting god, you should see some output like the following:

```terminal
# Events

I [2011-12-10 15:24:34]  INFO: Loading simple.god
I [2011-12-10 15:24:34]  INFO: Syslog enabled.
I [2011-12-10 15:24:34]  INFO: Using pid file directory: /Users/tom/.god/pids
I [2011-12-10 15:24:34]  INFO: Started on drbunix:///tmp/god.17165.sock
I [2011-12-10 15:24:34]  INFO: simple move 'unmonitored' to 'init'
I [2011-12-10 15:24:34]  INFO: simple moved 'unmonitored' to 'init'
I [2011-12-10 15:24:34]  INFO: simple [trigger] process is not running (ProcessRunning)
I [2011-12-10 15:24:34]  INFO: simple move 'init' to 'start'
I [2011-12-10 15:24:34]  INFO: simple start: ruby /Users/tom/dev/mojombo/god/simple.rb
I [2011-12-10 15:24:34]  INFO: simple moved 'init' to 'start'
I [2011-12-10 15:24:34]  INFO: simple [trigger] process is running (ProcessRunning)
I [2011-12-10 15:24:34]  INFO: simple move 'start' to 'up'
I [2011-12-10 15:24:34]  INFO: simple registered 'proc_exit' event for pid 23298
I [2011-12-10 15:24:34]  INFO: simple moved 'start' to 'up'

# Polls

I [2011-12-07 09:40:18]  INFO: Loading simple.god
I [2011-12-07 09:40:18]  INFO: Syslog enabled.
I [2011-12-07 09:40:18]  INFO: Using pid file directory: /Users/tom/.god/pids
I [2011-12-07 09:40:18]  INFO: Started on drbunix:///tmp/god.17165.sock
I [2011-12-07 09:40:18]  INFO: simple move 'unmonitored' to 'up'
I [2011-12-07 09:40:18]  INFO: simple moved 'unmonitored' to 'up'
I [2011-12-07 09:40:18]  INFO: simple [trigger] process is not running (ProcessRunning)
I [2011-12-07 09:40:18]  INFO: simple move 'up' to 'start'
I [2011-12-07 09:40:18]  INFO: simple start: ruby /Users/tom/dev/mojombo/god/simple.rb
I [2011-12-07 09:40:19]  INFO: simple moved 'up' to 'up'
I [2011-12-07 09:40:19]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 09:40:24]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 09:40:29]  INFO: simple [ok] process is running (ProcessRunning)
```

Here you can see god starting up, noticing that the `simple` process isn't
running, starting it, and then checking every five seconds to make sure it's
up. If you'd like to see god work its magic, go ahead and kill the `simple`
process. You should then see something like this:

```terminal
# Events

I [2011-12-10 15:33:38]  INFO: simple [trigger] process 23416 exited (ProcessExits)
I [2011-12-10 15:33:38]  INFO: simple move 'up' to 'start'
I [2011-12-10 15:33:38]  INFO: simple deregistered 'proc_exit' event for pid 23416
I [2011-12-10 15:33:38]  INFO: simple start: ruby /Users/tom/dev/mojombo/god/simple.rb
I [2011-12-10 15:33:38]  INFO: simple moved 'up' to 'start'
I [2011-12-10 15:33:38]  INFO: simple [trigger] process is running (ProcessRunning)
I [2011-12-10 15:33:38]  INFO: simple move 'start' to 'up'
I [2011-12-10 15:33:38]  INFO: simple registered 'proc_exit' event for pid 23601
I [2011-12-10 15:33:38]  INFO: simple moved 'start' to 'up'

# Polls

I [2011-12-07 09:54:59]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 09:55:04]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 09:55:09]  INFO: simple [trigger] process is not running (ProcessRunning)
I [2011-12-07 09:55:09]  INFO: simple move 'up' to 'start'
I [2011-12-07 09:55:09]  INFO: simple start: ruby /Users/tom/dev/mojombo/god/simple.rb
I [2011-12-07 09:55:09]  INFO: simple moved 'up' to 'up'
I [2011-12-07 09:55:09]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 09:55:14]  INFO: simple [ok] process is running (ProcessRunning)
```

While keeping a process up is useful, it would be even better if we could make
sure our process was behaving well and restart it when resource utilization
exceeds our specifications. With a few additions, we can easily have our
process restarted when memory usage or CPU goes above certain limits. Edit
your `simple.god` config file to look like this:

```ruby
God.watch do |w|
  w.name = "simple"
  w.start = "ruby /full/path/to/simple.rb"
  w.keepalive(:memory_max => 150.megabytes,
              :cpu_max => 50.percent)
end
```

Here I've specified a `:memory_max` option to the `keepalive` command. Now if
the process memory usage goes above 150 megabytes, god will restart it.
Similarly, by setting the `:cpu_max`, god will restart my process if its CPU
usage goes over 50%. By default these properties will be checked every 30
seconds and will be acted upon if there is an overage for three out of any
five checks. This prevents the process from getting restarted for temporary
resource spikes.

To test this out, modify your `simple.rb` server script to introduce a memory
leak:

```ruby
data = ''
loop do
  puts 'Hello'
  100000.times { data << 'x' }
end
```

Ctrl-C out of the foregrounded god instance. Notice that your current `simple`
server will continue to run. Start god again with the same command as before.
Now instead of starting the `simple` process, it will notice that one is
already running and simply switch to the `up` state.

```terminal
# Events

I [2011-12-10 15:36:00]  INFO: Loading simple.god
I [2011-12-10 15:36:00]  INFO: Syslog enabled.
I [2011-12-10 15:36:00]  INFO: Using pid file directory: /Users/tom/.god/pids
I [2011-12-10 15:36:00]  INFO: Started on drbunix:///tmp/god.17165.sock
I [2011-12-10 15:36:00]  INFO: simple move 'unmonitored' to 'init'
I [2011-12-10 15:36:00]  INFO: simple moved 'unmonitored' to 'init'
I [2011-12-10 15:36:00]  INFO: simple [trigger] process is running (ProcessRunning)
I [2011-12-10 15:36:00]  INFO: simple move 'init' to 'up'
I [2011-12-10 15:36:00]  INFO: simple registered 'proc_exit' event for pid 23601
I [2011-12-10 15:36:00]  INFO: simple moved 'init' to 'up'

# Polls

I [2011-12-07 14:50:46]  INFO: Loading simple.god
I [2011-12-07 14:50:46]  INFO: Syslog enabled.
I [2011-12-07 14:50:46]  INFO: Using pid file directory: /Users/tom/.god/pids
I [2011-12-07 14:50:47]  INFO: Started on drbunix:///tmp/god.17165.sock
I [2011-12-07 14:50:47]  INFO: simple move 'unmonitored' to 'up'
I [2011-12-07 14:50:47]  INFO: simple moved 'unmonitored' to 'up'
I [2011-12-07 14:50:47]  INFO: simple [ok] process is running (ProcessRunning)
```

In order to get our new `simple` server running, we can issue a command to god
to have our process restarted:

```terminal
$ god restart simple
```

From the logs you can see god killing and restarting the process:

```terminal
# Events

I [2011-12-10 15:38:13]  INFO: simple move 'up' to 'restart'
I [2011-12-10 15:38:13]  INFO: simple deregistered 'proc_exit' event for pid 23601
I [2011-12-10 15:38:13]  INFO: simple stop: default lambda killer
I [2011-12-10 15:38:13]  INFO: simple sent SIGTERM
I [2011-12-10 15:38:14]  INFO: simple process stopped
I [2011-12-10 15:38:14]  INFO: simple start: ruby /Users/tom/dev/mojombo/god/simple.rb
I [2011-12-10 15:38:14]  INFO: simple moved 'up' to 'restart'
I [2011-12-10 15:38:14]  INFO: simple [trigger] process is running (ProcessRunning)
I [2011-12-10 15:38:14]  INFO: simple move 'restart' to 'up'
I [2011-12-10 15:38:14]  INFO: simple registered 'proc_exit' event for pid 23707
I [2011-12-10 15:38:14]  INFO: simple moved 'restart' to 'up'

# Polls

I [2011-12-07 14:51:13]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 14:51:13]  INFO: simple move 'up' to 'restart'
I [2011-12-07 14:51:13]  INFO: simple stop: default lambda killer
I [2011-12-07 14:51:13]  INFO: simple sent SIGTERM
I [2011-12-07 14:51:14]  INFO: simple process stopped
I [2011-12-07 14:51:14]  INFO: simple start: ruby /Users/tom/dev/mojombo/god/simple.rb
I [2011-12-07 14:51:14]  INFO: simple moved 'up' to 'up'
I [2011-12-07 14:51:14]  INFO: simple [ok] process is running (ProcessRunning)
```

God will now start reporting on memory and CPU utilization of your process:

```terminal
# Events and Polls

I [2011-12-07 14:54:37]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 14:54:37]  INFO: simple [ok] memory within bounds [2032kb] (MemoryUsage)
I [2011-12-07 14:54:37]  INFO: simple [ok] cpu within bounds [0.0%%] (CpuUsage)
I [2011-12-07 14:54:42]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 14:54:42]  INFO: simple [ok] memory within bounds [2032kb, 13492kb] (MemoryUsage)
I [2011-12-07 14:54:42]  INFO: simple [ok] cpu within bounds [0.0%%, *99.7%%] (CpuUsage)
I [2011-12-07 14:54:47]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 14:54:47]  INFO: simple [ok] memory within bounds [2032kb, 13492kb, 25568kb] (MemoryUsage)
I [2011-12-07 14:54:47]  INFO: simple [ok] cpu within bounds [0.0%%, *99.7%%, *100.0%%] (CpuUsage)
I [2011-12-07 14:54:52]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 14:54:52]  INFO: simple [ok] memory within bounds [2032kb, 13492kb, 25568kb, 37556kb] (MemoryUsage)
I [2011-12-07 14:54:52]  INFO: simple [trigger] cpu out of bounds [0.0%%, *99.7%%, *100.0%%, *98.4%%] (CpuUsage)
I [2011-12-07 14:54:52]  INFO: simple move 'up' to 'restart'
```

In the last two lines of the above log you can see that CPU usage has been
above 50% for three checks, so god issues a restart operation. God will continue
to monitor the `simple` process for as long as god is running and the process
is set to be monitored.

Now, before you kill the god process, let's kill the `simple` server by asking
god to stop it for us. In a new terminal, issue the command:

```terminal
$ god stop simple
```

You should see the following output:

```terminal
Sending 'stop' command

The following watches were affected:
  simple
```

And in the foregrounded god terminal window, you'll see the log of what
happened:

```terminal
# Events

I [2011-12-10 15:41:04]  INFO: simple stop: default lambda killer
I [2011-12-10 15:41:04]  INFO: simple sent SIGTERM
I [2011-12-10 15:41:05]  INFO: simple process stopped
I [2011-12-10 15:41:05]  INFO: simple move 'up' to 'unmonitored'
I [2011-12-10 15:41:05]  INFO: simple deregistered 'proc_exit' event for pid 23707
I [2011-12-10 15:41:05]  INFO: simple moved 'up' to 'unmonitored'

# Polls

I [2011-12-07 09:59:59]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 10:00:04]  INFO: simple [ok] process is running (ProcessRunning)
I [2011-12-07 10:00:07]  INFO: simple stop: default lambda killer
I [2011-12-07 10:00:07]  INFO: simple sent SIGTERM
I [2011-12-07 10:00:08]  INFO: simple process stopped
I [2011-12-07 10:00:08]  INFO: simple move 'up' to 'unmonitored'
I [2011-12-07 10:00:08]  INFO: simple moved 'up' to 'unmonitored'
```

Now feel free to Ctrl-C out of god. Congratulations! You've just taken god for
a test ride and seen how easy it is to keep your processes running.

This is just the beginning of what god can do, and in reality, the `keepalive`
command is a convenience method written using more advanced transitional and
condition constructs that may be used directly. You can configure many
different kinds of conditions to have your process restarted when memory or
CPU are too high, when disk usage is above a threshold, when a process returns
an HTTP error code on a specific URL, and many more. In addition you can write
your own custom conditions and use them in your configuration files. Many
different lifecycle controls are available alongside a sophisticated and
extensible notifications system. Keep reading to find out what makes god
different from other monitoring systems and how it can help you solve many of
your process monitoring and control problems.


Config Files are Ruby Code!
---------------------------

Now that you've seen how to get started quickly, let's see how to use the more
powerful aspects of god. Once again, the best way to learn will be through an
example. The following configuration file is what I once used at gravatar.com
to keep the mongrels running:

```ruby
RAILS_ROOT = "/Users/tom/dev/gravatar2"

%w{8200 8201 8202}.each do |port|
  God.watch do |w|
    w.name = "gravatar2-mongrel-#{port}"

    w.start = "mongrel_rails start -c #{RAILS_ROOT} -p #{port} \
      -P #{RAILS_ROOT}/log/mongrel.#{port}.pid  -d"
    w.stop = "mongrel_rails stop -P #{RAILS_ROOT}/log/mongrel.#{port}.pid"
    w.restart = "mongrel_rails restart -P #{RAILS_ROOT}/log/mongrel.#{port}.pid"

    w.pid_file = File.join(RAILS_ROOT, "log/mongrel.#{port}.pid")

    w.behavior(:clean_pid_file)

    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = 5.seconds
        c.running = false
      end
    end

    w.restart_if do |restart|
      restart.condition(:memory_usage) do |c|
        c.above = 150.megabytes
        c.times = [3, 5] # 3 out of 5 intervals
      end

      restart.condition(:cpu_usage) do |c|
        c.above = 50.percent
        c.times = 5
      end
    end

    # lifecycle
    w.lifecycle do |on|
      on.condition(:flapping) do |c|
        c.to_state = [:start, :restart]
        c.times = 5
        c.within = 5.minute
        c.transition = :unmonitored
        c.retry_in = 10.minutes
        c.retry_times = 5
        c.retry_within = 2.hours
      end
    end
  end
end
```

That's a lot to take in at once, so I'll break it down by section and explain
what's going on in each.

```ruby
RAILS_ROOT = "/Users/tom/dev/gravatar2"
```

Here I've set a constant that is used throughout the file. Keeping the
`RAILS_ROOT` value in a constant makes it easy to adapt this script to other
applications. Because the config file is Ruby code, I can set whatever
variables or constants I want that make the configuration more concise and
easier to work with.

```ruby
%w{8200 8201 8202}.each do |port|
  ...
end
```

Because the config file is written in actual Ruby code, we can construct loops
and do other intelligent things that are impossible in your everyday,
run-of-the-mill config file. I need to watch three mongrels, so I simply loop over
their port numbers, eliminating duplication and making my life a whole lot
easier.
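To see what the loop buys us, here is a small sketch in plain Ruby that expands the same port list into per-process settings without involving god at all. The `app_root` path and the `settings` hashes are assumptions made up for this illustration; in a real config the values would be assigned to the watch inside `God.watch`.

```ruby
# Illustrative values only; this path and the hash layout are assumptions
# for the sketch and are not part of god.
app_root = "/var/www/myapp"

settings = %w{8200 8201 8202}.map do |port|
  pid_file = File.join(app_root, "log/mongrel.#{port}.pid")
  {
    :name     => "gravatar2-mongrel-#{port}",
    :pid_file => pid_file,
    :start    => "mongrel_rails start -c #{app_root} -p #{port} -P #{pid_file} -d"
  }
end

# One entry per port, each with its own name and PID file.
settings.each { |s| puts "#{s[:name]} -> #{s[:pid_file]}" }
```

Adding a fourth mongrel is then a matter of adding one port number to the list.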

```ruby
  God.watch do |w|
    w.name = "gravatar2-mongrel-#{port}"

    w.start = "mongrel_rails start -c #{RAILS_ROOT} -p #{port} \
      -P #{RAILS_ROOT}/log/mongrel.#{port}.pid  -d"
    w.stop = "mongrel_rails stop -P #{RAILS_ROOT}/log/mongrel.#{port}.pid"
    w.restart = "mongrel_rails restart -P #{RAILS_ROOT}/log/mongrel.#{port}.pid"

    w.pid_file = File.join(RAILS_ROOT, "log/mongrel.#{port}.pid")

    ...
  end
```

A `watch` represents a single process that has concrete start, stop, and/or
restart operations. You can define as many watches as you like. In the example
above, I've got some Rails instances running in Mongrels that I need to keep
alive. Every watch must have a unique `name` so that it can be identified
later on. The `start` and `stop` attributes specify the commands to start
and stop the process. If no `restart` attribute is set, a restart is
performed as a call to stop followed by a call to start. The
optional `grace` attribute sets the amount of time following a
start/stop/restart command to wait before resuming normal monitoring
operations. If the process you're watching runs as a daemon (as
mine does), you'll need to set the `pid_file` attribute.

```ruby
    w.behavior(:clean_pid_file)
```

Behaviors allow you to execute additional commands around start/stop/restart
commands. In our case, if the process dies it will leave a PID file behind.
The next time a start command is issued, it will fail, complaining about the
leftover PID file. We'd like the PID file cleaned up before a start command is
issued. The built-in behavior `clean_pid_file` will do just that.
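As a rough picture of what cleaning up a leftover PID file involves, here is a sketch in plain Ruby. It is not god's implementation, and `remove_stale_pid_file` is a hypothetical helper; with god you only need the `w.behavior(:clean_pid_file)` line shown above.

```ruby
require 'tmpdir'

# Hypothetical helper for illustration; not part of god's API.
# Returns true if a leftover file was removed, false if there was none.
def remove_stale_pid_file(path)
  File.delete(path)
  true
rescue Errno::ENOENT
  false # nothing to clean up
end

Dir.mktmpdir do |dir|
  pid_file = File.join(dir, 'mongrel.8200.pid')
  File.write(pid_file, "12345\n") # pretend a crashed process left this behind

  puts remove_stale_pid_file(pid_file) # the file existed and was removed
  puts remove_stale_pid_file(pid_file) # already gone, so nothing to do
end
```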

```ruby
    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = 5.seconds
        c.running = false
      end
    end
```

Watches contain conditions grouped by the action to execute should they return
`true`. I start with a `start_if` block that contains a single condition.
Conditions are specified by calling `condition` with an identifier, in this
case `:process_running`. Each condition can specify a poll interval that will
override the default watch interval. In this case, I want to check that the
process is still running every 5 seconds instead of the 30 second interval
that other conditions will inherit. The ability to set condition specific poll
intervals makes it possible to run critical tests (such as `:process_running`)
more often than less critical tests (such as `:memory_usage` and `:cpu_usage`).

```ruby
    w.restart_if do |restart|
      restart.condition(:memory_usage) do |c|
        c.above = 150.megabytes
        c.times = [3, 5] # 3 out of 5 intervals
      end

      ...
    end
```

Similar to `start_if` there is a `restart_if` command that groups conditions
that should trigger a restart. The `memory_usage` condition will fail if the
specified process is using too much memory. The maximum allowable amount of
memory is specified with the `above` attribute (you can use the `kilobytes`,
`megabytes`, or `gigabytes` helpers). The number of times the test needs to
fail in order to trigger a restart is set with `times`. This can be either an
integer or an array. An integer means it must fail that many times in a row
while an array `[x, y]` means it must fail `x` times out of the last `y`
tests.
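The two forms of `times` can be sketched in a few lines of plain Ruby. This is an illustration under my own assumptions rather than god's code: `trigger?` is a hypothetical helper, and it treats an integer `n` as the window `[n, n]`, which matches the "in a row" reading described above. The samples are the CPU readings from the Quick Start log.

```ruby
# Hypothetical helper, not god's API. `history` holds one boolean per check,
# oldest first, where true means the limit was exceeded on that check.
def trigger?(history, times)
  # Treat an Integer n as [n, n]: n failures within the last n checks,
  # which is the same as n failures in a row.
  required, window = times.is_a?(Array) ? times : [times, times]
  history.last(window).count(true) >= required
end

# CPU samples from the Quick Start log, checked against a 50% limit.
samples = [0.0, 99.7, 100.0, 98.4]
history = samples.map { |pct| pct > 50 }

puts trigger?(history.first(3), [3, 5]) # only two checks over the limit so far
puts trigger?(history, [3, 5])          # three of the last five are over
puts trigger?(history, 5)               # needs five in a row; only four checks exist
```

The array form tolerates an occasional good reading between bad ones, while the integer form resets as soon as a single check passes.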

```ruby
    w.restart_if do |restart|
      ...

      restart.condition(:cpu_usage) do |c|
        c.above = 50.percent
        c.times = 5
      end
    end
```

To keep an eye on CPU usage, I've employed the `cpu_usage` condition. When CPU
usage for a Mongrel process is over 50% for 5 consecutive intervals, it will
be restarted.

```ruby
    w.lifecycle do |on|
      on.condition(:flapping) do |c|
        c.to_state = [:start, :restart]
        c.times = 5
        c.within = 5.minute
        c.transition = :unmonitored
        c.retry_in = 10.minutes
        c.retry_times = 5
        c.retry_within = 2.hours
      end
    end
```

Conditions inside a `lifecycle` section are active as long as the process is being monitored (they live across state changes).

The `:flapping` condition guards against the edge case wherein god rapidly
starts or restarts your application. Things like server configuration changes
or the unavailability of external services could make it impossible for my
process to start. In that case, god will try to start my process over and over
to no avail. The `:flapping` condition provides two levels of giving up on
flapping processes. If I were to translate the options of the code above, it
would be something like: If this watch is started or restarted five times
within five minutes, then unmonitor it...then after ten minutes, monitor it
again to see if it was just a temporary problem; if the process is seen to be
flapping five times within two hours, then give up completely.
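The first level of that rule (five starts or restarts within five minutes) can be sketched in plain Ruby as follows. This is only an illustration of the idea, not god's implementation; `flapping?` is a hypothetical helper, timestamps are plain seconds, and the retry options (`retry_in`, `retry_times`, `retry_within`) are not modeled.

```ruby
# Hypothetical helper, not god's API. `timestamps` are the times (in seconds)
# at which the watch entered the start or restart state, oldest first.
def flapping?(timestamps, times, within)
  recent = timestamps.last(times)
  return false if recent.size < times
  # Flapping when the most recent `times` transitions span less than `within`.
  (recent.last - recent.first) < within
end

five_minutes = 5 * 60

rapid = [0, 30, 60, 90, 120]    # five restarts in two minutes
slow  = [0, 100, 200, 300, 400] # five restarts spread over 400 seconds

puts flapping?(rapid, 5, five_minutes)
puts flapping?(slow, 5, five_minutes)
```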

That's it!

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Starting and Controlling God
----------------------------

To start the god monitoring process as a daemon simply run the `god`
executable passing in the path to the config file (you need to sudo if you're
using events on Linux or want to use the setuid/setgid functionality):

```terminal
$ sudo god -c /path/to/config.god
```

While you're writing your config file, it can be helpful to run god in the
foreground so you can see the log messages. You can do that with:

```terminal
$ sudo god -c /path/to/config.god -D
```

You can start/restart/stop/monitor/unmonitor your Watches with the same
utility like so:

```terminal
$ sudo god stop gravatar2-mongrel-8200
```

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Watching Non-Daemon Processes
-----------------------------

Need to watch a script that doesn't have built-in daemonization? No problem!
God will daemonize and keep track of your process for you. If you don't
specify a `pid_file` attribute for a watch, it will be auto-daemonized and a
PID file will be stored for it in `/var/run/god`.


```ruby
God.pid_file_directory = '/home/tom/pids'

# Watch for a process that daemonizes itself (-d) and writes its own PID
# file, so god does not auto-daemonize it.
# RAILS_ROOT is assumed to be defined earlier in the config.
God.watch do |w|
  w.name = 'mongrel'
  w.pid_file = File.join(RAILS_ROOT, "log/mongrel.pid")

  w.start = "mongrel_rails start -P #{RAILS_ROOT}/log/mongrel.pid  -d"

  # ...
end

# Watch with no pid_file set: god auto-daemonizes the process and stores
# a PID file for it in the pid file directory.
God.watch do |w|
  w.name = 'worker'
  # w.pid_file is not set

  w.start = "rake resque:worker"

  # ...
end
```


If you'd rather have the PID file stored in a different location, you can 
set it at the top of your config:

```ruby
God.pid_file_directory = '/home/tom/pids'
```

The directory you specify must be writable by god.
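Before pointing `God.pid_file_directory` at a custom location, it can help to confirm that the directory exists and is writable by the user that will run god. The sketch below is plain Ruby and makes no claim about how god itself performs this check; `first_writable_dir` and the candidate list are hypothetical.

```ruby
require 'tmpdir'

# Hypothetical helper for illustration; not part of god's API.
# Returns the first candidate that is an existing, writable directory,
# or nil if none qualifies.
def first_writable_dir(candidates)
  candidates.find { |dir| File.directory?(dir) && File.writable?(dir) }
end

Dir.mktmpdir do |tmp|
  # The first candidate does not exist, so the temporary directory is chosen.
  candidates = [File.join(tmp, 'does-not-exist'), tmp]
  puts first_writable_dir(candidates)
end
```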


/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Grouping Watches
----------------

Watches can be assigned to groups. These groups can then be controlled
together from the command line.

```ruby
  God.watch do |w|
    ...

    w.group = 'mongrels'

    ...
  end
```

The above configuration now allows you to control the watch (and any others
that are in the group) with a single command:

```terminal
$ sudo god stop mongrels
```

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Redirecting STDOUT and STDERR of your Process
---------------------------------------------

By default, the STDOUT stream for your process is redirected to `/dev/null`.
To get access to this output, you can redirect the stream either to a file or
to a command.

To redirect STDOUT to a file, set the `log` attribute to a file path. The file
will be written in append mode and created if it does not exist.

```ruby
  God.watch do |w|
    ...

    w.log = '/var/log/myprocess.log'

    ...
  end
```

To redirect STDOUT to a command that will be run for you, set the `log_cmd`
attribute to a command.

```ruby
  God.watch do |w|
    ...

    w.log_cmd = '/usr/bin/logger'

    ...
  end
```

By default, STDERR is redirected to STDOUT. You can redirect it to a file or a
command just like STDOUT by setting the `err_log` or `err_log_cmd` attributes
respectively.
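
To see what "append mode" and "STDERR redirected to STDOUT" mean in practice, here is a plain Ruby sketch that does not use god at all. It spawns a throwaway child process with the same kind of redirection; the temporary file stands in for the path you would give to `log`, and the child script is made up for the demonstration.

```ruby
require 'rbconfig'
require 'tempfile'

# Stand-in for the file you would name in `w.log`.
log = Tempfile.new('myprocess')

# A tiny child process that writes one line to each stream.
child = ['-e', '$stdout.puts "to stdout"; $stderr.puts "to stderr"']

# Open the log in append mode for STDOUT and point STDERR at the same place,
# mirroring the defaults described above.
pid = Process.spawn(RbConfig.ruby, *child,
                    out: [log.path, 'a'], err: [:child, :out])
Process.wait(pid)

captured = File.read(log.path)
puts captured
```

Both lines end up in the same file. Their order can vary because the child buffers its STDOUT.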

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Changing UID/GID for processes
------------------------------

It is possible to have god run your start/stop/restart commands as a specific
user/group. This can be done by setting the `uid` and/or `gid` attributes of a
watch.

```ruby
  God.watch do |w|
    ...

    w.uid = 'tom'
    w.gid = 'devs'

    ...
  end
```

This only works for commands specified as a string. Lambda commands are
unaffected.

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Setting the Working Directory
-----------------------------

By default, God sets the working directory to `/` before running your process.
You can change this by setting the `dir` attribute on the watch.

```ruby
  God.watch do |w|
    ...

    w.dir = '/var/www/myapp'

    ...
  end
```

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Setting environment variables
-----------------------------

You can set any number of environment variables you wish via the `env`
attribute of a watch.

```ruby
  God.watch do |w|
    ...

    w.env = { 'RAILS_ROOT' => "/var/www/myapp",
              'RAILS_ENV' => "production" }

    ...
  end
```
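
If you are unsure what your process will actually see, the following plain Ruby sketch (no god involved) spawns a child with an extra environment variable and a working directory, then prints what the child reports. The temporary directory stands in for the path you would give to `dir` in the previous section.

```ruby
require 'rbconfig'
require 'tmpdir'

env = { 'RAILS_ENV' => 'production' }  # what you would assign to w.env
dir = Dir.mktmpdir                     # stand-in for the value of w.dir

# The child reports the environment variable and working directory it sees.
script = 'puts ENV["RAILS_ENV"]; puts Dir.pwd'

reader, writer = IO.pipe
pid = Process.spawn(env, RbConfig.ruby, '-e', script, chdir: dir, out: writer)
writer.close
Process.wait(pid)

child_env, child_dir = reader.read.split("\n")
puts "RAILS_ENV=#{child_env} cwd=#{child_dir}"
```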

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Using chroot to Change the File System Root
-------------------------------------------

If you want your process to run chrooted, simply use the `chroot` attribute on
the watch. The specified directory must exist and have a `/dev/null`.

```ruby
  God.watch do |w|
    ...

    w.chroot = '/var/myroot'

    ...
  end
```

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Lambda commands
---------------

In addition to specifying start/stop/restart commands as strings (to be
executed via the shell), you can specify a lambda that will be called.

```ruby
  God.watch do |w|
    ...

    w.start = lambda { ENV['APACHE'] ? `apachectl -k graceful` : `lighttpd restart` }

    ...
  end
```
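
The difference between the two kinds of command can be pictured with a small dispatcher. This is a sketch of the idea only, not god's implementation, and `run_command` is a hypothetical name.

```ruby
# Hypothetical dispatcher, not god's implementation: a String is run as an
# external command, a lambda is simply called.
def run_command(command)
  case command
  when String then system(command)
  when Proc   then command.call
  else raise ArgumentError, "unsupported command: #{command.inspect}"
  end
end

run_command('true')                      # runs the `true` executable
run_command(lambda { :called_in_ruby })  # runs Ruby code in this process
```

In this sketch the lambda never leaves the Ruby process, which is one way to think about why process-level settings apply to string commands only.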

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Customizing the Default Stop Lambda
-----------------------------------

If you do not provide a stop command, God will attempt to stop your process by
first sending a SIGTERM. It will then wait for ten seconds for the process to
exit. If after this time it still has not exited, it will be sent a SIGKILL.
You can customize the stop signal and/or the time to wait for the process to
exit by setting the `stop_signal` and `stop_timeout` attributes on the watch.

```ruby
  God.watch do |w|
    ...

    w.stop_signal = 'QUIT'
    w.stop_timeout = 20.seconds

    ...
  end
```
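
The signal-then-wait-then-kill sequence described above can be sketched in plain Ruby. This is not god's code: `stop_process` is a hypothetical helper, and it relies on `Process.wait`, which only works when the process is a child of the caller.

```ruby
require 'timeout'

# Hypothetical helper, not god's code: send `signal`, wait up to `timeout`
# seconds for the child to exit, then fall back to SIGKILL.
# Process.wait only works for children of the current process.
def stop_process(pid, signal: 'TERM', timeout: 10)
  Process.kill(signal, pid)
  Timeout.timeout(timeout) { Process.wait(pid) }
  :exited
rescue Timeout::Error
  Process.kill('KILL', pid)
  Process.wait(pid)
  :killed
end
```

A call such as `stop_process(pid, signal: 'QUIT', timeout: 20)` would correspond to the `stop_signal` and `stop_timeout` values in the config above.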


/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Loading Other Config Files
--------------------------

Feel free to split your god configs into separate files for easier
organization. You can load other configs using Ruby's normal `load`
method, or use the convenience method `God.load`, which allows glob-style
paths:

```ruby
# load in all god configs
God.load "/usr/local/conf/*.god"
```

God won't start its monitoring operations until all configurations have been
loaded.
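
If you are curious what a glob-aware loader amounts to, it can be approximated in a few lines of plain Ruby. This is a rough sketch and not how `God.load` is implemented: `load_configs` is a hypothetical name, and sorting the paths is an addition made here to get a predictable order.

```ruby
# Rough equivalent of a glob-aware loader; `load_configs` is a hypothetical
# name, not god's API.
def load_configs(glob)
  paths = Dir.glob(glob).sort  # sorted for a predictable load order
  paths.each { |path| load(File.expand_path(path)) }
  paths
end
```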

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Dynamically Loading Config Files Into an Already Running God
------------------------------------------------------------

God allows you to load or reload configurations into an already running
instance. There are a few things to consider when doing this:

* Existing Watches with the same `name` as the incoming Watches will be
  overridden by the new config.
* All paths must be either absolute or relative to the path from which god was
  started.

To load a config into a running god, issue the following command:

```terminal
$ sudo god load path/to/config.god
```

Config files that are loaded dynamically can contain anything that a normal
config file contains. However, global options such as `God.pid_file_directory`
will be ignored (and produce a warning in the logs).

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Getting Logs for a Single Watch
-------------------------------

Sifting through the god logs for statements specific to a single Watch can be
frustrating when you have many of them. You can get the realtime logs for a
single Watch via the command line:

```terminal
$ sudo god log local-3000
```

This will display log output for the 'local-3000' Watch and update every
second with new log messages.

You can also supply a shorthand to the log command that will match one of your
watches. If it happens to match several, the shortest match will be used:

```terminal
$ sudo god log l3
```
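
One way to picture the shorthand lookup is as a "characters in order" match followed by a preference for the shortest name. The sketch below works under that assumption; it is not taken from god's source, and `match_watch` is a hypothetical name.

```ruby
# Hypothetical matcher: treat the shorthand as a sequence of characters that
# must appear in order, then prefer the shortest matching watch name.
def match_watch(shorthand, names)
  parts = shorthand.chars.map { |ch| Regexp.escape(ch) }
  pattern = Regexp.new(parts.join('.*'))
  names.select { |name| name =~ pattern }.min_by(&:length)
end

match_watch('l3', ['local-3000', 'local-3000-staging'])  # => "local-3000"
```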

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Notifications
-------------

God has an extensible notification framework built in that makes it easy to
have notifications sent when conditions are triggered. Each notification type
has a set of configuration parameters that must be set. These parameters may
be set globally via Contact Defaults or individually via Contact Instances.

*Contact Defaults* - Some parameters are unlikely to change on a per-contact
basis. You should set those parameters via the defaults mechanism.

```ruby
God::Contacts::Email.defaults do |d|
  d.from_email = 'god@example.com'
  d.from_name = 'God'
  d.delivery_method = :sendmail
end
```

*Contact Instances* - Each contact must have a unique `name` set. You may
optionally assign each contact to a `group`.

```ruby
God.contact(:email) do |c|
  c.name = 'tom'
  c.group = 'developers'
  c.to_email = 'tom@example.com'
end

God.contact(:email) do |c|
  c.name = 'vanpelt'
  c.group = 'developers'
  c.to_email = 'vanpelt@example.com'
end

God.contact(:email) do |c|
  c.name = 'kevin'
  c.group = 'developers'
  c.to_email = 'kevin@example.com'
end
```

*Condition Attachment* - To have a specific contact notified when a condition
is triggered, simply set the condition's `notify` attribute to the name of the
individual contact.

```ruby
  w.transition(:up, :start) do |on|
    on.condition(:process_exits) do |c|
      c.notify = 'tom'
    end
  end
```

There are two ways to specify that a notification should be sent. The first,
easier way is shown above. Every condition can take an optional `notify`
attribute that specifies which contacts should be notified when the condition
is triggered. The value can be a contact name or contact group *or* an array
of contact names and/or contact groups.

```ruby
  w.transition(:up, :start) do |on|
    on.condition(:process_exits) do |c|
      c.notify = {:contacts => ['tom', 'developers'], :priority => 1, :category => 'product'}
    end
  end
```

The second way allows you to specify the `priority` and `category` in addition
to the contacts. The extra attributes can be arbitrary integers or strings and
will be passed as-is to the notification subsystem.

The above notification will arrive as an email similar to the following.

```
From: God <god@example.com>
To: tom <tom@example.com>
Subject: [god] mongrel-8600 [trigger] process exited (ProcessExits)

Message: mongrel-8600 [trigger] process exited (ProcessExits)
Host: candymountain.example.com
Priority: 1
Category: product
```
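
Since `notify` accepts a single name, an array of names, or a hash, it can help to think of all three as one normalized structure. The following is a sketch of that idea only; `normalize_notify` is a hypothetical helper and not part of god.

```ruby
# Hypothetical normalizer, not god's code: accept any of the `notify` forms
# shown above and return one uniform hash.
def normalize_notify(value)
  spec = value.is_a?(Hash) ? value : { :contacts => value }
  {
    :contacts => Array(spec[:contacts]),
    :priority => spec[:priority],
    :category => spec[:category]
  }
end

normalize_notify('tom')
normalize_notify(['tom', 'developers'])
normalize_notify(:contacts => ['tom', 'developers'], :priority => 1, :category => 'product')
```

In every case `:contacts` comes back as an array, and `:priority` and `:category` are `nil` unless the hash form supplied them.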

Available Notification Types
----------------------------

Campfire
~~~~~~~~

Send a notice to a Campfire room (http://campfirenow.com).

```ruby
God::Contacts::Campfire.defaults do |d|
  ...
end

God.contact(:campfire) do |c|
  ...
end
```

```
subdomain - The String subdomain of the Campfire account. If your URL is
            "foo.campfirenow.com" then your subdomain is "foo".
token     - The String token used for authentication.
room      - The String room name to which the message should be sent.
ssl       - A Boolean determining whether or not to use SSL
            (default: false).
```

Email
~~~~~

Send a notice to an email address.

```ruby
God::Contacts::Email.defaults do |d|
  ...
end

God.contact(:email) do |c|
  ...
end
```

```
to_email        - The String email address to which the email will be sent.
to_name         - The String name corresponding to the recipient.
from_email      - The String email address from which the email will be sent.
from_name       - The String name corresponding to the sender.
delivery_method - The Symbol delivery method. [ :smtp | :sendmail ]
                  (default: :smtp).

=== SMTP Options (when delivery_method = :smtp) ===
server_host     - The String hostname of the SMTP server (default: localhost).
server_port     - The Integer port of the SMTP server (default: 25).
server_auth     - A Boolean or Symbol, false if no authentication else a symbol
                  for the type of authentication [false | :plain | :login | :cram_md5]
                  (default: false).

=== SMTP Auth Options (when server_auth is not false) ===
server_domain   - The String domain.
server_user     - The String username.
server_password - The String password.

=== Sendmail Options (when delivery_method = :sendmail) ===
sendmail_path   - The String path to the sendmail executable
                  (default: "/usr/sbin/sendmail").
sendmail_args   - The String args to send to sendmail (default: "-i -t").
```
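
Putting the parameters above together, an SMTP setup might look roughly like the following config fragment. Only the attribute names come from the list above. The host name, domain, user, and password are placeholders you would replace with your own values.

```ruby
# Shared SMTP settings; host, domain, user, and password are placeholders.
God::Contacts::Email.defaults do |d|
  d.from_email      = 'god@example.com'
  d.from_name       = 'God'
  d.delivery_method = :smtp
  d.server_host     = 'smtp.example.com'
  d.server_port     = 25
  d.server_auth     = :plain
  d.server_domain   = 'example.com'
  d.server_user     = 'god'
  d.server_password = 'secret'
end

# Per-contact settings only need the recipient details.
God.contact(:email) do |c|
  c.name     = 'tom'
  c.to_email = 'tom@example.com'
  c.to_name  = 'Tom'
end
```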

Jabber
~~~~~~

Send a notice to a Jabber address (http://jabber.org/).

Google Mail addresses should work. If you need a non-Gmail address, you can
sign up for one at http://register.jabber.org/.

```ruby
God::Contacts::Jabber.defaults do |d|
  ...
end

God.contact(:jabber) do |c|
  ...
end
```

```
host     - The String hostname of the Jabber server.
port     - The Integer port of the Jabber server.
from_jid - The String Jabber ID of the sender.
password - The String password of the sender.
to_jid   - The String Jabber ID of the recipient.
subject  - The String subject of the message (default: "God Notification").
```

Prowl
~~~~~

Send a notice to Prowl (http://prowl.weks.net/).

```ruby
God::Contacts::Prowl.defaults do |d|
  ...
end

God.contact(:prowl) do |c|
  ...
end
```

```
apikey - The String API key.
```

Scout
~~~~~

Send a notice to Scout (http://scoutapp.com/).

```ruby
God::Contacts::Scout.defaults do |d|
  ...
end

God.contact(:scout) do |c|
  ...
end
```

```
client_key - The String client key.
plugin_id  - The String plugin id.
```

Twitter
~~~~~~~

Send a notice to a Twitter account (http://twitter.com/).

In order to use the Twitter notification, you will need to authorize God via
OAuth and then get the OAuth token and secret for your account. The easiest
way to do this is with a Ruby gem called `twurl`. Install it like so:

```terminal
[sudo] gem install twurl
```

Then, run the following:

```terminal
twurl auth --consumer-key gOhjax6s0L3mLeaTtBWPw \
           --consumer-secret yz4gpAVXJHKxvsGK85tEyzQJ7o2FEy27H1KEWL75jfA
```

This will return a URL. Copy it to your clipboard. Make sure you are logged
into Twitter with the account that will be used for the notifications, and then
paste the URL into a new browser window. At the end of the authentication
process, you will be given a PIN. Copy this PIN and paste it back to the
command line prompt. Once this is complete, you need to find your access token
and secret:

```terminal
cat ~/.twurlrc
```

This will output the contents of the config file from which you can grab your
access token and secret:

```
---
profiles:
  mojombo:
    gOhjax6s0L3mLeaTtBWPw:
      token: 17376380-KXA91nCrgaQ4HxUXMmZtM38gB56qS3hx1NYbjT6mQ
      consumer_key: gOhjax6s0L3mLeaTtBWPw
      username: mojombo
      consumer_secret: yz4gpAVXJHKxvsGK85tEyzQJ7o2FEy27H1KEWL75jfA
      secret: EBWFQBCtuMwCDeU4OXlc3LwGyY8OdWAV0Jg5KVB0
configuration:
  default_profile:
  - mojombo
  - gOhjax6s0L3mLeaTtBWPw
```

The access token and secret (the `token` and `secret` values above) are what
you need to use as parameters to the Twitter notification.

```ruby
God::Contacts::Twitter.defaults do |d|
  ...
end

God.contact(:twitter) do |c|
  ...
end
```

```
consumer_token  - The String OAuth consumer token (defaults to God's
                  existing consumer token).
consumer_secret - The String OAuth consumer secret (defaults to God's
                  existing consumer secret).
access_token    - The String OAuth access token.
access_secret   - The String OAuth access secret.
```

Webhook
~~~~~~~

Send a notice to a webhook (http://www.webhooks.org/).

```ruby
God::Contacts::Webhook.defaults do |d|
  ...
end

God.contact(:webhook) do |c|
  ...
end
```

```
url    - The String webhook URL.
format - The Symbol format [ :form | :json ] (default: :form).
```
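
The two `format` values differ only in how the request body is encoded. The sketch below shows that difference in plain Ruby. The field names in `data` are placeholders chosen for illustration, not the payload god actually sends, and `encode_payload` is a hypothetical helper.

```ruby
require 'json'
require 'uri'

# Hypothetical encoder: the field names in `data` are placeholders, not the
# payload god actually sends.
def encode_payload(data, format = :form)
  case format
  when :form then URI.encode_www_form(data)
  when :json then JSON.generate(data)
  else raise ArgumentError, "unknown format: #{format.inspect}"
  end
end

data = { 'message' => 'process exited', 'priority' => 1 }
encode_payload(data, :form)  # => "message=process+exited&priority=1"
encode_payload(data, :json)  # => '{"message":"process exited","priority":1}'
```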

/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Advanced Configuration with Transitions and Events
--------------------------------------------------

So far you've been introduced to a simple poll-based config file and seen how
to run it. Poll-based monitoring works great for simple things, but falls
short for highly critical tasks. God has native support for kqueue/netlink
events on BSD/Darwin/Linux systems. For instance, instead of using the
`process_running` condition to poll for the status of your process, you can
use the `process_exits` condition that will be notified *immediately* upon the
exit of your process. This means less load on your system and shorter downtime
after a crash.

While the configuration syntax you saw in the previous example is very simple,
it lacks the power that we need to deal with event based monitoring. In fact,
the `start_if` and `restart_if` methods are really just calling out to a
lower-level API. If we use the low-level API directly, we can harness the full
power of god's event based lifecycle system. Let's look at another example
config file.

```ruby
RAILS_ROOT = "/Users/tom/dev/gravatar2"

God.watch do |w|
  w.name = "local-3000"

  w.start = "mongrel_rails start -c #{RAILS_ROOT} -P #{RAILS_ROOT}/log/mongrel.pid -p 3000 -d"
  w.stop = "mongrel_rails stop -P #{RAILS_ROOT}/log/mongrel.pid"
  w.restart = "mongrel_rails restart -P #{RAILS_ROOT}/log/mongrel.pid"

  w.pid_file = File.join(RAILS_ROOT, "log/mongrel.pid")

  # clean pid files before start if necessary
  w.behavior(:clean_pid_file)

  # determine the state on startup
  w.transition(:init, { true => :up, false => :start }) do |on|
    on.condition(:process_running) do |c|
      c.running = true
    end
  end

  # determine when process has finished starting
  w.transition([:start, :restart], :up) do |on|
    on.condition(:process_running) do |c|
      c.running = true
    end

    # failsafe
    on.condition(:tries) do |c|
      c.times = 5
      c.transition = :start
    end
  end

  # start if process is not running
  w.transition(:up, :start) do |on|
    on.condition(:process_exits)
  end

  # restart if memory or cpu is too high
  w.transition(:up, :restart) do |on|
    on.condition(:memory_usage) do |c|
      c.interval = 20
      c.above = 50.megabytes
      c.times = [3, 5]
    end

    on.condition(:cpu_usage) do |c|
      c.interval = 10
      c.above = 10.percent
      c.times = [3, 5]
    end
  end

  # lifecycle
  w.lifecycle do |on|
    on.condition(:flapping) do |c|
      c.to_state = [:start, :restart]
      c.times = 5
      c.within = 5.minute
      c.transition = :unmonitored
      c.retry_in = 10.minutes
      c.retry_times = 5
      c.retry_within = 2.hours
    end
  end
end
```

A bit longer, I know, but very straightforward once you understand how the
`transition` calls work. The `name`, `start`, `stop`, `restart`, and
`pid_file` attributes should be familiar. We also specify the `clean_pid_file`
behavior.

Before jumping into the code, it's important to understand the different
states that a Watch can have, and how that state changes over time. At any
given time, a Watch will be in one of the `init`, `up`, `start`, or `restart`
states. As different conditions are satisfied, the Watch will progress from
state to state, enabling and disabling conditions along the way.

When god first starts, each Watch is placed in the `init` state.

You'll use the `transition` method to tell god how to transition between
states. It takes two arguments. The first argument may be either a symbol or
an array of symbols representing the state or states during which the
specified conditions should be enabled. The second argument may be either a
symbol or a hash. If it is a symbol, then that is the state that will be
transitioned to if any of the conditions return `true`. If it is a hash, then
that hash must have both `true` and `false` keys, each of which points to a
symbol that represents the state to transition to given the corresponding
return from the single condition that must be specified.

```ruby
  # determine the state on startup
  w.transition(:init, { true => :up, false => :start }) do |on|
    on.condition(:process_running) do |c|
      c.running = true
    end
  end
```

The first transition block tells god what to do when the Watch is in the
`init` state (first argument). This is where I tell god how to determine if my
task is already running. Since I'm monitoring a process, I can use the
`process_running` condition to determine whether the process is running. If
the process is running, it will return true, otherwise it will return false.
Since I sent a hash as the second argument to `transition`, the return from
`process_running` will determine which of the two states will be transitioned
to. If the process is running, the return is true and god will put the Watch
into the `up` state. If the process is not running, the return is false and
god will put the Watch into the `start` state.
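
The symbol-versus-hash rule for the second argument can be summarized as a tiny function. This is a sketch of the rule as described here, not god's internal code, and `next_state` is a hypothetical name.

```ruby
# Hypothetical resolver, not god's code. `destination` is the second argument
# to `transition`; `result` is what the condition returned.
def next_state(destination, result)
  case destination
  when Hash   then destination.fetch(result)   # needs both true and false keys
  when Symbol then result ? destination : nil  # nil means "stay where you are"
  end
end

on_init = { true => :up, false => :start }
next_state(on_init, true)   # => :up
next_state(on_init, false)  # => :start
next_state(:up, false)      # => nil
```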

```ruby
  # determine when process has finished starting
  w.transition([:start, :restart], :up) do |on|
    on.condition(:process_running) do |c|
      c.running = true
    end

    ...
  end
```

If god has determined that my process isn't running, the Watch will be put
into the `start` state. Upon entering this state, the `start` command that I
specified on the Watch will be called. In addition, the above transition
specifies a condition that should be enabled when in either the `start` or
`restart` states. The condition is another `process_running`, however this
time I'm only interested in moving to another state once it returns `true`. A
`true` return from this condition means that the process is running and it's
ok to transition to the `up` state (second argument to `transition`).

```ruby
  # determine when process has finished starting
  w.transition([:start, :restart], :up) do |on|
    ...

    # failsafe
    on.condition(:tries) do |c|
      c.times = 5
      c.transition = :start
    end
  end
```

The other half of this transition uses the `tries` condition to ensure that
god doesn't get stuck in this state. It's possible that the process could go
down while the transition is being made, in which case god would end up
polling forever to see if the process is up. Here I've specified that if this
condition is called five times, god should override the normal transition
destination and move to the `start` state instead. If you specify a
`transition` attribute on any condition, god will move to that state instead
of the normal transition destination.
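
Here is a stand-alone sketch of how a counting failsafe with its own destination might behave. `TriesSketch` and `destination_for` are hypothetical stand-ins written for this illustration, not god's classes.

```ruby
# Hypothetical stand-in for a `tries`-style condition: it reports true once it
# has been tested `times` times and may carry its own destination state.
class TriesSketch
  attr_reader :transition

  def initialize(times, transition = nil)
    @times = times
    @transition = transition
    @count = 0
  end

  def test
    @count += 1
    @count >= @times
  end
end

# A condition's own `transition` wins over the block's default destination.
def destination_for(condition, default)
  condition.transition || default
end

failsafe = TriesSketch.new(5, :start)
results = Array.new(5) { failsafe.test }  # => [false, false, false, false, true]
destination_for(failsafe, :up)            # => :start
```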

```ruby
  # start if process is not running
  w.transition(:up, :start) do |on|
    on.condition(:process_exits)
  end
```

This is where the event based system comes into play. Once in the `up` state,
I want to be notified when my process exits. The `process_exits` condition
registers a callback that will trigger a transition change when it is fired
off. Event conditions (like this one) cannot be used in transitions that have
a hash for the second argument (as they do not return true or false).

```ruby
  # restart if memory or cpu is too high
  w.transition(:up, :restart) do |on|
    on.condition(:memory_usage) do |c|
      c.interval = 20
      c.above = 50.megabytes
      c.times = [3, 5]
    end

    on.condition(:cpu_usage) do |c|
      c.interval = 10
      c.above = 10.percent
      c.times = [3, 5]
    end
  end
```

Notice that I can have multiple transitions with the same start state. In this
case, I want to have the `memory_usage` and `cpu_usage` poll conditions going
at the same time that I listen for the process exit event. In the case of
runaway CPU or memory usage, however, I want to transition to the `restart`
state. When a Watch enters the `restart` state it will either call the
`restart` command that you specified, or if none has been set, call the `stop`
and then `start` commands.
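
The restart fallback just described boils down to a simple choice. The sketch below lists the commands that would run, in order. `restart_steps` is a hypothetical helper, and the command strings are shortened placeholders.

```ruby
# Hypothetical helper: list the commands a restart would run, in order.
def restart_steps(commands)
  if commands[:restart]
    [commands[:restart]]
  else
    [commands[:stop], commands[:start]]
  end
end

with_restart    = { :start => 'app start', :stop => 'app stop', :restart => 'app restart' }
without_restart = { :start => 'app start', :stop => 'app stop' }

restart_steps(with_restart)     # => ["app restart"]
restart_steps(without_restart)  # => ["app stop", "app start"]
```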


/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Extend God with your own Conditions
-----------------------------------

God was designed from the start to allow you to easily write your own custom
conditions, making it simple to add tests that are application-specific.
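
To give a feel for what a custom poll condition could look like, here is a stand-alone sketch. It assumes that a condition is a class with a `test` method returning true or false plus a `valid?` check. Treat those method names, the stand-in base class `PollConditionSketch`, and the `SpoolBacklog` example as assumptions, and check god's source for the exact interface before relying on them.

```ruby
# Stand-in base class so this sketch runs without god installed. In a real
# config you would subclass god's own poll condition class instead.
class PollConditionSketch
  attr_accessor :interval
end

# Hypothetical condition: true when a spool directory holds more than
# `above` entries.
class SpoolBacklog < PollConditionSketch
  attr_accessor :dir, :above

  # Report whether the condition has been configured completely.
  def valid?
    !dir.nil? && !above.nil?
  end

  # Return true when the condition is met.
  def test
    (Dir.entries(dir) - %w[. ..]).size > above
  end
end
```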


/////////////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////////////

Contribute
----------

If you'd like to hack on god itself or contribute fixes or new functionality,
read this section.

The codebase can be found at https://github.com/mojombo/god. To get started,
fork god on GitHub into your own account and then pull that down to your local
machine. This way you can easily submit changes via Pull Requests later on.

```terminal
$ git clone git@github.com:yourusername/god
```

We recommend using link:https://github.com/sstephenson/rbenv[rbenv] and
link:https://github.com/sstephenson/ruby-build[ruby-build] to manage multiple
versions of Ruby and their separate gemsets. Any changes to god must work on
both Ruby 1.8.7-p352 and 1.9.3-p0.

God uses link:http://gembundler.com/[bundler] to deal with development
dependencies. Once you have the code locally, you can pull in all the
dependencies like so:

```terminal
$ cd god
$ bundle install
```

In order for process events to function during development you'll need to
compile the C extensions:

```terminal
$ cd ext/god
$ ruby extconf.rb
$ make
$ cd ../..
```

Now you're ready to run the tests and make sure everything is configured
properly. On Linux you'll need to run the tests as root in order for the
events system to load. On macOS there is no need to run the tests as root.

```terminal
$ [sudo] bundle exec rake
```

To run your development god to make sure config files and such still work
properly, just run:

```terminal
$ [sudo] bundle exec god -c myconfig.god -D
```

There are a bunch of example config files for various scenarios in
`test/configs` that you can try out. For big new features, it's great to add a
new test config showing off the usage of the feature.

If you intend to contribute your changes back to god core, make sure you create
a new branch and do your work there. Then, when your changes are ready to be
shared with the world, push them to your fork and issue a Pull Request against
mojombo/god. Make sure to describe your changes in detail and add relevant
tests.

Any feature additions or changes should be accompanied by corresponding updates
to the documentation. It can be found in the `doc` directory. The
documentation is done in link:http://github.com/github/gollum[Gollum] format
and then converted into the public site at http://godrb.com. To see the
generated site locally you'll first need to commit your changes to git and then
issue the following:

```terminal
$ bundle exec rake site
```

This will open the site in your browser so you can check for correctness.