Channel: VMware Communities : Popular Discussions - VIX API

VIX Perl routines sometimes never return

I have a script that manages the state of several VMs on the local host, which runs VMware Server 1.0x. I'm using the Perl VIX API for some of the automation, and I've noticed that certain routines occasionally never return (I have waited at least a couple of hours). I've seen this with:

  • VMPowerOn()

  • VMRevertToSnapshot()

  • GetProperties($vmHandle,VIX_PROPERTY_VM_POWER_STATE)

 

When the failures occurred, these routines were otherwise doing what I wanted them to do. It's just that sometimes they don't return (they hang). I don't know of any difference between the times they work and the times they don't.

 

For VMPowerOn() and VMRevertToSnapshot(), I tried to work around this occasional problem by introducing a timeout with code like the following:

 

sub do_revert_with_timeout {
    my ($vmHandle, $snapshotHandle, $timeout) = @_;
    return undef unless defined($vmHandle);
    $timeout = 150 unless defined $timeout;

    my $err   = VIX_OK;
    my $start = time();
    eval {
        $SIG{ALRM} = sub { die "timed out after $timeout seconds\n" };
        alarm($timeout);

        # on timeout, alarm handler above will execute and we'll fall out of this eval
        # on normal exit, we'll fall out of the bottom of the eval with no error
        $err = VMRevertToSnapshot($vmHandle, $snapshotHandle, 0, VIX_INVALID_HANDLE);
        alarm(0);
        $SIG{ALRM} = 'IGNORE';
    };
    my $elapsed = time() - $start;
    if ($@) {
        if ($@ =~ /timed out after/) { # we timed out
            print "$@\n";
            return 0;
        } else { # the method call did a die
            # propagate
            alarm(0);
            die;
        }
    }
    return 1;
}

 

However, my alarm never goes off (perhaps the method uses SIGALRM internally).
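
Another possible explanation, not confirmed against VIX: since Perl 5.8.0, handlers set through %SIG are deferred until the interpreter reaches a safe point between opcodes, so a signal that arrives while control is inside a blocking C-level call may not be handled until that call returns. A handler installed with POSIX::sigaction is delivered immediately instead. The following is only a sketch of that approach. The helper name run_with_timeout is hypothetical, it assumes a POSIX platform, and it has not been tested against any VIX routine.

```perl
use strict;
use warnings;
use POSIX qw(SIGALRM);

# Hypothetical helper: run $code with a limit of $timeout seconds.
# Returns (1, @results) on completion, (0) on timeout; rethrows other errors.
sub run_with_timeout {
    my ($timeout, $code) = @_;

    # Install the handler via POSIX::sigaction instead of %SIG so that it
    # runs immediately rather than being deferred between Perl opcodes.
    my $action = POSIX::SigAction->new(sub { die "timed out\n" });
    my $old    = POSIX::SigAction->new;
    POSIX::sigaction(SIGALRM, $action, $old)
        or die "cannot install SIGALRM handler: $!\n";

    my @result;
    my $ok = eval {
        alarm($timeout);
        @result = $code->();
        alarm(0);
        1;
    };
    my $err = $@;
    alarm(0);                           # make sure no alarm is left pending
    POSIX::sigaction(SIGALRM, $old);    # restore the previous handler

    return (1, @result) if $ok;
    return (0) if $err eq "timed out\n";
    die $err;                           # unrelated failure: propagate
}
```

A call might look like run_with_timeout(150, sub { VMRevertToSnapshot($vmHandle, $snapshotHandle, 0, VIX_INVALID_HANDLE) }). Be aware that dying out of an immediately delivered handler abandons the C call partway through, so the library's internal state, and possibly the handles involved, may not be safe to reuse afterwards.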

 

For GetProperties, I had used it thousands of times over weeks and hadn't noticed it having this behavior until today.

 

Does anyone know what is causing this or how to avoid it?  Or has anyone else seen this?

 

Alternatively, does anyone have a suggestion for how to time this out (without requiring a fork())?  (I guess I could arrange for a SIGUSR1 to be sent to the process periodically, and the signal handler could check how long it has been.)

 

Thanks.

