Nudging SAS Viya Services Timeout

I had been puzzling over why some SAS® Viya™ services were not starting on a machine reboot. Initially I thought the answer appeared in the SAS Viya 3.2 Administration documentation set: see the General Servers and Services: Troubleshooting section.

I found that all the expected services started after running:

[root@hostname ~]# /etc/init.d/sas-viya-all-services stop
[root@hostname ~]# rm -f /opt/sas/viya/config/data/consul/checks/*
[root@hostname ~]# /etc/init.d/sas-viya-all-services start
[root@hostname ~]# /etc/init.d/sas-viya-all-services status

However, on further investigation it turned out that it probably wasn’t a problem with those consul/checks files. After another reboot I found that, once again, only a subset of the services had started. Using systemctl to check the status I found the following:

[root@hostname ~]# systemctl status sas-viya-all-services
 sas-viya-all-services.service - start and stop all SAS services
   Loaded: loaded (/usr/lib/systemd/system/sas-viya-all-services.service; enabled; vendor preset: disabled)
   Active: failed (Result: timeout) since Mon 2017-04-24 13:33:54 AEST; 31min ago
  Process: 717 ExecStart=/etc/init.d/sas-viya-all-services start (code=killed, signal=TERM)
 Main PID: 717 (code=killed, signal=TERM)
   CGroup: /system.slice/sas-viya-all-services.service

Apr 24 13:33:12 hostname su[12824]: (to sasrabbitmq) root on none
Apr 24 13:33:30 hostname sas-viya-all-services[717]: There are still 1 pending processes
Apr 24 13:33:30 hostname sas-viya-all-services[717]: Starting sas-viya-datatables-default
Apr 24 13:33:30 hostname sas-viya-all-services[717]: Starting sas-viya-deploymentBackup-default
Apr 24 13:33:30 hostname sas-viya-all-services[717]: Starting sas-viya-device-management-default
Apr 24 13:33:30 hostname sas-viya-all-services[717]: Pausing to allow services time to start...
Apr 24 13:33:54 hostname systemd[1]: sas-viya-all-services.service start operation timed out. Terminating.
Apr 24 13:33:54 hostname systemd[1]: Failed to start start and stop all SAS services.
Apr 24 13:33:54 hostname systemd[1]: Unit sas-viya-all-services.service entered failed state.
Apr 24 13:33:54 hostname systemd[1]: sas-viya-all-services.service failed.

So the failure was due to how long sas-viya-all-services was taking to start all the services: systemd terminated the start script when the start operation timed out. This is a simple dev/test deployment with everything on one machine, unlike a production deployment, where the services are much more likely to be distributed over multiple machines. I needed to raise the timeout for sas-viya-all-services to allow it to complete.
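
To confirm that diagnosis without reading the whole status output, the unit's Result property can be queried directly. The following is a minimal sketch: unit_result is a hypothetical helper name, and it only filters text on standard input, so it makes no assumptions about the host beyond systemctl show printing Name=value lines.

```shell
#!/bin/sh
# Sketch: extract the value of the Result property from `systemctl show` output.
# unit_result is an illustrative helper name, not part of SAS Viya or systemd.
unit_result() {
    sed -n 's/^Result=//p'
}

# On the real host:
#   systemctl show -p Result sas-viya-all-services.service | unit_result
```

For the failed state shown above, where the status reads "failed (Result: timeout)", this would be expected to report timeout.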

I could see the current timeout settings with:

[root@hostname ~]# systemctl show sas-viya-all-services.service | grep ^Timeout
TimeoutStartUSec=15min
TimeoutStopUSec=15min

… so I bumped the timeout up to 60 minutes to give it more than enough time:

[root@hostname ~]# sed -i s/TimeoutSec=15min/TimeoutSec=60min/ /usr/lib/systemd/system/sas-viya-all-services.service
[root@hostname ~]# systemctl daemon-reload
[root@hostname ~]# systemctl show sas-viya-all-services.service | grep ^Timeout
TimeoutStartUSec=1h
TimeoutStopUSec=1h
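
An alternative worth considering, since files under /usr/lib/systemd/system belong to the installed packages and an update may replace them, is a systemd drop-in file that overrides only the timeout. This is a sketch that has not been tested against this deployment: the function name and the file name timeout.conf are illustrative choices, while the .service.d drop-in directory and the TimeoutSec= setting in the [Service] section are standard systemd conventions.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that overrides only the timeout.
# write_timeout_dropin and the file name timeout.conf are illustrative
# assumptions, not something shipped with SAS Viya.
write_timeout_dropin() {
    dropin_dir=$1    # e.g. /etc/systemd/system/sas-viya-all-services.service.d
    timeout=$2       # e.g. 60min
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nTimeoutSec=%s\n' "$timeout" > "$dropin_dir/timeout.conf"
}

# On the real host (as root), followed by a reload so systemd picks it up:
#   write_timeout_dropin /etc/systemd/system/sas-viya-all-services.service.d 60min
#   systemctl daemon-reload
#   systemctl show sas-viya-all-services.service | grep ^Timeout
```

With a drop-in in place the packaged unit file stays untouched, and removing the override is just a matter of deleting the file and reloading.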

… did another reboot and watched the progress with:

[root@hostname ~]# tail -f `ls -1 /opt/sas/viya/config/var/log/all-services/default/all-services*.log | tail -n 1`

… then when complete verified with:

[root@hostname ~]# systemctl status sas-viya-all-services
 sas-viya-all-services.service - start and stop all SAS services
   Loaded: loaded (/usr/lib/systemd/system/sas-viya-all-services.service; enabled; vendor preset: disabled)
   Active: active (exited) since Mon 2017-04-24 15:09:15 AEST; 2min 24s ago
  Process: 722 ExecStart=/etc/init.d/sas-viya-all-services start (code=exited, status=0/SUCCESS)
 Main PID: 722 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/sas-viya-all-services.service

Apr 24 15:09:15 hostname sas-viya-all-services[722]: sas-viya-backup-agent-default                          00:01:41
Apr 24 15:09:15 hostname sas-viya-all-services[722]: sas-viya-environmentmanager-default                    00:02:41
Apr 24 15:09:15 hostname sas-viya-all-services[722]: sas-viya-monitoring-default                            00:02:41
Apr 24 15:09:15 hostname sas-viya-all-services[722]: sas-viya-sashome-default                               00:02:41
Apr 24 15:09:15 hostname sas-viya-all-services[722]: sas-viya-sasreportviewer-default                       00:02:47
Apr 24 15:09:15 hostname sas-viya-all-services[722]: sas-viya-sasthemedesigner-default                      00:02:28
Apr 24 15:09:15 hostname sas-viya-all-services[722]: sas-viya-sasvisualanalytics-default                    00:02:27
Apr 24 15:09:15 hostname sas-viya-all-services[722]: sas-viya-sasvisualdatabuilder-default                  00:02:27
Apr 24 15:09:15 hostname sas-viya-all-services[722]: sas-services completed in 00:31:05
Apr 24 15:09:15 hostname systemd[1]: Started start and stop all SAS services.

… or:

[root@hostname ~]# /etc/init.d/sas-viya-all-services status
Getting service info from consul...
  Service                                            Status     Host               Port     PID
  sas-viya-consul-default                            up         N/A                 N/A    3727
  sas-viya-sasdatasvrc-postgres-node0-ct-pg_hba      up         N/A                 N/A    3969
  sas-viya-sasdatasvrc-postgres-node0-ct-postgresql  up         N/A                 N/A    3991
  sas-viya-sasdatasvrc-postgres-pgpool0-ct-pcp       up         N/A                 N/A    4036
  sas-viya-sasdatasvrc-postgres-pgpool0-ct-pgpool    up         N/A                 N/A    4054
  sas-viya-sasdatasvrc-postgres-pgpool0-ct-pool_hba  up         N/A                 N/A    4278
  sas-viya-sasdatasvrc-postgres                      up         N/A                 N/A    4740
  sas-viya-cascontroller-default                     up         N/A                 N/A     930
  sas-viya-httpproxy-default                         up         N/A                 N/A    6290
  sas-viya-rabbitmq-server-default                   up         N/A                 N/A    6039
  sas-viya-sasdatasvrc-postgres-node0                up         N/A                 N/A    4689
  sas-viya-sasstudio-default                         up         N/A                 N/A    1115
  sas-viya-spawner-default                           up         N/A                 N/A     858
...
  sas-viya-sasvisualanalytics-default                up         10.10.10.10       43115   11303
  sas-viya-sasvisualdatabuilder-default              up         10.10.10.10       37424   11334

sas-services completed in 00:00:18

Given the extra time, now all of the SAS Viya services get a chance to start after a machine reboot.
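
The "sas-services completed in 00:31:05" line in the systemctl status output above shows why the original limit fell short. As a quick arithmetic check, here is a small sketch that converts that duration to seconds and compares it with the old and new timeouts; hms_to_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: convert an HH:MM:SS duration to seconds so it can be compared
# with a timeout. hms_to_seconds is an illustrative helper name.
hms_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

startup=$(hms_to_seconds 00:31:05)   # "sas-services completed in" value from the status output
old_limit=$((15 * 60))               # original TimeoutStartUSec=15min
new_limit=$((60 * 60))               # raised TimeoutStartUSec=1h

[ "$startup" -gt "$old_limit" ] && echo "startup exceeds the 15min limit"
[ "$startup" -lt "$new_limit" ] && echo "startup fits within the 1h limit"
```

On this machine a full start took roughly twice the original 15 minutes, so 60 minutes leaves comfortable headroom.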

5 thoughts on “Nudging SAS Viya Services Timeout”

  1. I recently installed SAS Viya 3.2 and experienced much the same issue when starting the servers: the pgpool services did not start properly, and that affected the other services too. But the issue only occurs if the server stops abnormally, or if you stop the server without properly stopping the Viya services first. One workaround is to stop all the Viya services before stopping or rebooting the server; hopefully you will then not face the issue when you start the services or reboot the server again.

  2. Hi Sanket,

    Thanks for sharing your experiences. That’s not something that I’ve encountered yet but I’ll keep an eye out for it. I’ll probably switch from the default auto-start-on-boot/auto-stop-on-shutdown to a manual-start-on-boot/manual-stop-on-shutdown for this dev/test environment. I like an opportunity to investigate and tweak before starting/stopping the services.

    Cheers
    Paul

  3. Hi Paul,

    Great post, another accidental google led me here. This solves a bit of a problem for us; it not being a solution I was actively searching for makes it all the better.

    Sanket, I’ve seen the issue with the pgpool being caused by the improper shutdown of postgres a number of times too. I’m curious as to whether increasing the timeout when the same script is used to stop pg might actually fix the issue. It certainly feels like the script isn’t blocking the shutdown for long enough to allow postgres to stop cleanly.

    Nik

  4. I am facing an issue while starting the microservices:
    Unable to status the consul leader.
    waiting for consul service to start
    I would appreciate your help.

  5. I haven’t seen that specific error so have no suggestions other than to check the logs for further clues, post to https://communities.sas.com/ to appeal to a wider audience, and engage the expertise of SAS Technical Support.
