Service Fabric local development cluster with 5 nodes runs fewer instances and fewer partitions than expected

I am running a Service Fabric application on a local development cluster with 5 nodes "simulated" on my PC.

The application has a public API stateless service with instance count set to -1.

I expect to see 5 instances of the stateless service in Service Fabric Explorer, but I only see 1.

The application also has an actor service with partition count set to 10 (the configuration auto-generated by Visual Studio).

When the application is deployed to the development cluster on my PC, only one partition is visible in Service Fabric Explorer. Even after I simulate a heavy load, with CPU and memory usage on my PC at or above 90%, there is still only one partition of the actor service. To check whether something is wrong with my environment, I made a stateful service with partition count set to 5, and it runs as expected.

Is this normal for stateless services, or is there something wrong with my configuration? Is this behavior specific to the development cluster, intended to avoid things like port conflicts?

What about the actor service? According to the docs, dynamic partition scaling is possible, but the number of partitions for the actor service does not increase even under high load. In addition, the Actor docs do not mention dynamic partition scaling at all.

Thanks in advance!

EDIT: After some tests with different configurations I got it working.

Original configuration in ApplicationManifest.xml:

<Parameters>
   ...
    <Parameter Name="HttpAPI_InstanceCount" DefaultValue="-1" />

    <Parameter Name="SystemStatusConsumerActorService_PartitionCount" 
               DefaultValue="10" />
   ...
</Parameters>

<DefaultServices>
    <Service Name="HttpAPI">
      <StatelessService ServiceTypeName="HttpAPIType" 
                        InstanceCount="[HttpAPI_InstanceCount]">
        <SingletonPartition />
      </StatelessService>
    </Service>

    <Service Name="SystemStatusConsumerActorService" 
             GeneratedIdRef="faad4d24-04db-4e06-8a1d-22bc6255c7fe|Persisted">

      <StatefulService ServiceTypeName="SystemStatusConsumerActorServiceType" TargetReplicaSetSize="[SystemStatusConsumerActorService_TargetReplicaSetSize]" MinReplicaSetSize="[SystemStatusConsumerActorService_MinReplicaSetSize]">

        <UniformInt64Partition 
          PartitionCount="[SystemStatusConsumerActorService_PartitionCount]" 
          LowKey="-9223372036854775808" 
          HighKey="9223372036854775807" />
      </StatefulService>
    </Service>
</DefaultServices>

The configuration that works:

<Parameters>
   ...
    <Parameter Name="HttpAPIInstanceCount" DefaultValue="-1" />

    <Parameter Name="SystemStatusConsumerActorServicePartitionCount" 
               DefaultValue="10" />
   ...
</Parameters>

<DefaultServices>
    <Service Name="HttpAPI">
      <StatelessService ServiceTypeName="HttpAPIType" 
                        InstanceCount="[HttpAPIInstanceCount]">
        <SingletonPartition />
      </StatelessService>
    </Service>

    <Service Name="SystemStatusConsumerActorService" 
             GeneratedIdRef="faad4d24-04db-4e06-8a1d-22bc6255c7fe|Persisted">

      <StatefulService ServiceTypeName="SystemStatusConsumerActorServiceType" TargetReplicaSetSize="[SystemStatusConsumerActorService_TargetReplicaSetSize]" MinReplicaSetSize="[SystemStatusConsumerActorService_MinReplicaSetSize]">

        <UniformInt64Partition 
          PartitionCount="[SystemStatusConsumerActorServicePartitionCount]" 
          LowKey="-9223372036854775808" 
          HighKey="9223372036854775807" />
      </StatefulService>
    </Service>
</DefaultServices>

Notice that the only differences are the parameter names:

HttpAPI_InstanceCount changed to HttpAPIInstanceCount

SystemStatusConsumerActorService_PartitionCount changed to SystemStatusConsumerActorServicePartitionCount

There is 1 answer

Zapo

Even after I tried a lot of different configurations, there was still a problem. I found the cause by checking the git diff of my project.

My problem was that the ApplicationManifest parameters were overridden in the ApplicationParameters\Local.5Node.xml file (because I use the 5-node local cluster) in the Service Fabric application folder.
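For illustration, a parameters file of this kind typically looks like the sketch below. The application name fabric:/MyApplication is hypothetical, and the Value="1" entries are an assumption consistent with the behavior described in the question (one instance, one partition); the actual file was not posted.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- ApplicationParameters\Local.5Node.xml (sketch; application name and values are assumptions) -->
<Application xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             Name="fabric:/MyApplication"
             xmlns="http://schemas.microsoft.com/2011/01/fabric">
  <Parameters>
    <!-- A Value given here takes precedence over the DefaultValue in ApplicationManifest.xml -->
    <Parameter Name="HttpAPI_InstanceCount" Value="1" />
    <Parameter Name="SystemStatusConsumerActorService_PartitionCount" Value="1" />
  </Parameters>
</Application>
```

Any parameter listed in this file wins over the DefaultValue of the same name in the manifest, which would explain why setting -1 and 10 in ApplicationManifest.xml had no visible effect.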

The tricky part was that even if I delete or comment out the override of SystemStatusConsumerActorService_PartitionCount, Visual Studio adds it back every time I build the application. The only solution I found was to change the name of the parameter in ApplicationManifest.xml.

After changing the configuration accordingly, the stateless service and the actor service start with the desired number of instances and partitions respectively.

Of course, 4 of the 5 stateless instances break, but this is entirely logical considering that the whole cluster "runs" on one machine.
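If the failing instances are caused by a port conflict (an assumption; I did not verify the exact cause), the usual place to look is the endpoint declaration in the service's ServiceManifest.xml. A minimal sketch, where the endpoint name and the port number are hypothetical:

```xml
<Resources>
  <Endpoints>
    <!-- Fixed port (hypothetical value): every instance on the same machine tries to bind 8080 -->
    <Endpoint Name="ServiceEndpoint" Protocol="http" Type="Input" Port="8080" />
  </Endpoints>
</Resources>
```

With the Port attribute omitted, Service Fabric assigns a port from the application port range, which can avoid this kind of clash on a single-machine cluster, at the cost of the API no longer listening on a well-known port.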