Unable to join an Akka.NET cluster

I'm having trouble joining an Akka.NET cluster and debugging the join. I'm using version 1.3.8. My setup is as follows:

Lighthouse

Almost the default code from GitHub, running as a console application. akka.hocon is as follows:

lighthouse {
  actorsystem: "sng"
}

petabridge.cmd{
    host = "0.0.0.0"
    port = 9110
}

akka {
  loglevel = DEBUG
  loggers = ["Akka.Logger.Serilog.SerilogLogger, Akka.Logger.Serilog"]
  actor {
    provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
                  debug {
                  receive = on
                  autoreceive = on
                  lifecycle = on
                  event-stream = on
                  unhandled = on
              }
  }
  remote {
      log-sent-messages = on
      log-received-messages = on
      log-remote-lifecycle-events = on
        enabled-transports = ["akka.remote.dot-netty.tcp"]
    dot-netty.tcp {
      transport-class = "Akka.Remote.Transport.DotNetty.TcpTransport, Akka.Remote"
      applied-adapters = []
      transport-protocol = tcp
      hostname = "0.0.0.0"
      port = 4053
    }
    log-remote-lifecycle-events = DEBUG
  }            
  cluster {
    auto-down-unreachable-after = 5s
    seed-nodes = [] 
    roles = [lighthouse]
  }
}

Working node

Also a console application (net461) that does nothing more than start up and join the cluster. It works as expected. akka.hocon:

akka {
  loglevel = DEBUG
  loggers = ["Akka.Logger.Serilog.SerilogLogger, Akka.Logger.Serilog"]
  actor {
    provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
  }

  remote {
      log-sent-messages = on
      log-received-messages = on
      log-remote-lifecycle-events = on
    dot-netty.tcp {
      transport-class = "Akka.Remote.Transport.DotNetty.TcpTransport, Akka.Remote"
      applied-adapters = []
      transport-protocol = tcp
      hostname = "0.0.0.0"
      port = 0
    }
  }            

  cluster {
    auto-down-unreachable-after = 5s
    seed-nodes = ["akka.tcp://[email protected]:4053"] 
    roles = [monitor]
  }
}

Non-working node

A .NET 4.6.1 library registered as a COM component and started inside another application (MediaMonkey) with this VBA code:

Sub OnStartup
   Set o = CreateObject("MediaMonkey.Akka.Agent.MediaMonkeyAkkaProxy")
   o.Init(SDB)
End Sub

The Akka system is created the same way as in the console application, with the standard ActorSystem.Create("sng", config);

akka.hocon:

akka {
  loglevel = DEBUG
  loggers = ["Akka.Logger.Serilog.SerilogLogger, Akka.Logger.Serilog"]
  actor {
    provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
  }
  remote {
      log-sent-messages = on
      log-received-messages = on
      log-remote-lifecycle-events = on
    dot-netty.tcp {
      transport-class = "Akka.Remote.Transport.DotNetty.TcpTransport, Akka.Remote"
      applied-adapters = []
      transport-protocol = tcp
      hostname = "0.0.0.0"
      port = 0
    }
  }            
  cluster {
    auto-down-unreachable-after = 5s
    seed-nodes = ["akka.tcp://[email protected]:4053"] 
    roles = [mediamonkey]
  }
}

Debugging workflow

  1. Start the Lighthouse application:

    Configuration Result:
    [Success] Name sng.Lighthouse
    [Success] ServiceName sng.Lighthouse
    Topshelf v4.0.0.0, .NET Framework v4.0.30319.42000
    [Lighthouse] ActorSystem: sng; IP: 127.0.0.1; PORT: 4053
    [Lighthouse] Performing pre-boot sanity check. Should be able to parse address [akka.tcp://[email protected]:4053]
    [Lighthouse] Parse successful.
    [21:01:35 INF] Starting remoting
    [21:01:35 INF] Remoting started; listening on addresses : [akka.tcp://[email protected]:4053]
    [21:01:35 INF] Remoting now listens on addresses: [akka.tcp://[email protected]:4053]
    [21:01:35 INF] Cluster Node [akka.tcp://[email protected]:4053] - Starting up...
    [21:01:35 INF] Cluster Node [akka.tcp://[email protected]:4053] - Started up successfully
    The sng.Lighthouse service is now running, press Control+C to exit.
    [21:01:35 INF] petabridge.cmd host bound to [0.0.0.0:9110]
    [21:01:35 INF] Node [akka.tcp://[email protected]:4053] is JOINING, roles [lighthouse]
    [21:01:35 INF] Leader is moving node [akka.tcp://[email protected]:4053] to [Up]

  2. Start and stop the console node

Lighthouse logs:

[21:05:40 INF] Node [akka.tcp://[email protected]:37516] is JOINING, roles [monitor]
[21:05:40 INF] Leader is moving node [akka.tcp://[email protected]:37516] to [Up]
[21:05:54 INF] Connection was reset by the remote peer. Channel [[::ffff:127.0.0.1]:4053->[::ffff:127.0.0.1]:37517](Id=1293c63a)
[21:05:54 INF] Message AckIdleCheckTimer from akka://sng/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fsng%400.0.0.0%3A37516-1/endpointWriter to akka://sng/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fsng%400.0.0.0%3A37516-1/endpointWriter was not delivered. 1 dead letters encountered.
[21:05:55 INF] Message GossipStatus from akka://sng/system/cluster/core/daemon to akka://sng/deadLetters was not delivered. 2 dead letters encountered.
[21:05:55 INF] Message Heartbeat from akka://sng/system/cluster/core/daemon/heartbeatSender to akka://sng/deadLetters was not delivered. 3 dead letters encountered.
[21:05:56 INF] Message GossipStatus from akka://sng/system/cluster/core/daemon to akka://sng/deadLetters was not delivered. 4 dead letters encountered.
[21:05:56 INF] Message Heartbeat from akka://sng/system/cluster/core/daemon/heartbeatSender to akka://sng/deadLetters was not delivered. 5 dead letters encountered.
[21:05:57 INF] Message GossipStatus from akka://sng/system/cluster/core/daemon to akka://sng/deadLetters was not delivered. 6 dead letters encountered.
[21:05:57 INF] Message Heartbeat from akka://sng/system/cluster/core/daemon/heartbeatSender to akka://sng/deadLetters was not delivered. 7 dead letters encountered.
[21:05:58 INF] Message GossipStatus from akka://sng/system/cluster/core/daemon to akka://sng/deadLetters was not delivered. 8 dead letters encountered.
[21:05:58 INF] Message Heartbeat from akka://sng/system/cluster/core/daemon/heartbeatSender to akka://sng/deadLetters was not delivered. 9 dead letters encountered.
[21:05:59 WRN] Cluster Node [akka.tcp://[email protected]:4053] - Marking node(s) as UNREACHABLE [Member(address = akka.tcp://[email protected]:37516, Uid=1060233119 status = Up, role=[monitor], upNumber=2)]. Node roles [lighthouse]
[21:06:01 WRN] AssociationError [akka.tcp://[email protected]:4053] -> akka.tcp://[email protected]:37516: Error [Association failed with akka.tcp://[email protected]:37516] []
[21:06:01 WRN] Tried to associate with unreachable remote address [akka.tcp://[email protected]:37516]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: [Association failed with akka.tcp://[email protected]:37516] Caused by: [System.AggregateException: One or more errors occurred. ---> Akka.Remote.Transport.InvalidAssociationException: No connection could be made because the target machine actively refused it tcp://[email protected]:37516
   at Akka.Remote.Transport.DotNetty.TcpTransport.<AssociateInternal>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Akka.Remote.Transport.DotNetty.DotNettyTransport.<Associate>d__22.MoveNext()
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
   at Akka.Remote.Transport.ProtocolStateActor.<>c.<InitializeFSM>b__11_54(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
---> (Inner Exception #0) Akka.Remote.Transport.InvalidAssociationException: No connection could be made because the target machine actively refused it tcp://[email protected]:37516
   at Akka.Remote.Transport.DotNetty.TcpTransport.<AssociateInternal>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Akka.Remote.Transport.DotNetty.DotNettyTransport.<Associate>d__22.MoveNext()<---
]
[21:06:04 INF] Cluster Node [akka.tcp://[email protected]:4053] - Leader is auto-downing unreachable node [akka.tcp://[email protected]:4053]
[21:06:04 INF] Marking unreachable node [akka.tcp://[email protected]:37516] as [Down]
[21:06:05 INF] Leader is removing unreachable node [akka.tcp://[email protected]:37516]
[21:06:05 WRN] Association to [akka.tcp://[email protected]:37516] having UID [1060233119] is irrecoverably failed. UID is now quarantined and all messages to this UID will be delivered to dead letters. Remote actorsystem must be restarted to recover from this situation.

Working node logs:

[21:05:38 INF] Starting remoting
[21:05:38 INF] Remoting started; listening on addresses : [akka.tcp://[email protected]:37516]
[21:05:38 INF] Remoting now listens on addresses: [akka.tcp://[email protected]:37516]
[21:05:38 INF] Cluster Node [akka.tcp://[email protected]:37516] - Starting up...
[21:05:38 INF] Cluster Node [akka.tcp://[email protected]:37516] - Started up successfully
[21:05:40 INF] Welcome from [akka.tcp://[email protected]:4053]
[21:05:40 INF] Member is Up: Member(address = akka.tcp://[email protected]:4053, Uid=439782041 status = Up, role=[lighthouse], upNumber=1)
[21:05:40 INF] Member is Up: Member(address = akka.tcp://[email protected]:37516, Uid=1060233119 status = Up, role=[monitor], upNumber=2)
//shutdown logs are missing

  3. Start and stop the COM node

Lighthouse logs:

[21:12:02 INF] Connection was reset by the remote peer. Channel [::ffff:127.0.0.1]:4053->[::ffff:127.0.0.1]:37546](Id=4ca91e15)

COM node logs:

[WARNING][18. 07. 2018 19:11:15][Thread 0001][ActorSystem(sng)] The type name for serializer 'hyperion' did not resolve to an actual Type: 'Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion'
[WARNING][18. 07. 2018 19:11:15][Thread 0001][ActorSystem(sng)] Serialization binding to non existing serializer: 'hyperion'
[21:11:15 DBG] Logger log1-SerilogLogger [SerilogLogger] started
[21:11:15 DBG] StandardOutLogger being removed
[21:11:15 DBG] Default Loggers started
[21:11:15 INF] Starting remoting
[21:11:15 DBG] Starting prune timer for endpoint manager...
[21:11:15 INF] Remoting started; listening on addresses : [akka.tcp://[email protected]:37543]
[21:11:15 INF] Remoting now listens on addresses: [akka.tcp://[email protected]:37543]
[21:11:15 INF] Cluster Node [akka.tcp://[email protected]:37543] - Starting up...
[21:11:15 INF] Cluster Node [akka.tcp://[email protected]:37543] - Started up successfully
[21:11:15 DBG] [Uninitialized] Received Akka.Cluster.InternalClusterAction+Subscribe
[21:11:15 DBG] [Uninitialized] Received Akka.Cluster.InternalClusterAction+Subscribe
[21:11:16 DBG] [Uninitialized] Received Akka.Cluster.InternalClusterAction+JoinSeedNodes
[21:11:16 DBG] [Uninitialized] Received Akka.Cluster.InternalClusterAction+Subscribe
[21:11:26 WRN] Couldn't join seed nodes after [2] attempts, will try again. seed-nodes=[akka.tcp://[email protected]:4053]
[21:11:31 WRN] Couldn't join seed nodes after [3] attempts, will try again. seed-nodes=[akka.tcp://[email protected]:4053]
[21:11:36 WRN] Couldn't join seed nodes after [4] attempts, will try again. seed-nodes=[akka.tcp://[email protected]:4053]
[21:11:40 ERR] No response from remote. Handshake timed out or transport failure detector triggered.
[21:11:40 WRN] AssociationError [akka.tcp://[email protected]:37543] -> akka.tcp://[email protected]:4053: Error [Association failed with akka.tcp://[email protected]:4053] []
[21:11:40 WRN] Tried to associate with unreachable remote address [akka.tcp://[email protected]:4053]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: [Association failed with akka.tcp://[email protected]:4053] Caused by: [Akka.Remote.Transport.AkkaProtocolException: No response from remote. Handshake timed out or transport failure detector triggered.
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Akka.Remote.Transport.AkkaProtocolTransport.<Associate>d__19.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Akka.Remote.EndpointWriter.<AssociateAsync>d__23.MoveNext()]
[21:11:40 DBG] Disassociated [akka.tcp://[email protected]:37543] -> akka.tcp://[email protected]:4053
[21:11:40 INF] Message InitJoin from akka://sng/system/cluster/core/daemon/joinSeedNodeProcess-1 to akka://sng/deadLetters was not delivered. 1 dead letters encountered.
[21:11:40 INF] Message InitJoin from akka://sng/system/cluster/core/daemon/joinSeedNodeProcess-1 to akka://sng/deadLetters was not delivered. 2 dead letters encountered.
[21:11:40 INF] Message InitJoin from akka://sng/system/cluster/core/daemon/joinSeedNodeProcess-1 to akka://sng/deadLetters was not delivered. 3 dead letters encountered.
[21:11:40 INF] Message InitJoin from akka://sng/system/cluster/core/daemon/joinSeedNodeProcess-1 to akka://sng/deadLetters was not delivered. 4 dead letters encountered.
[21:11:40 INF] Message InitJoin from akka://sng/system/cluster/core/daemon/joinSeedNodeProcess-1 to akka://sng/deadLetters was not delivered. 5 dead letters encountered.
[21:11:40 INF] Message AckIdleCheckTimer from akka://sng/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fsng%40127.0.0.1%3A4053-1/endpointWriter to akka://sng/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2Fsng%40127.0.0.1%3A4053-1/endpointWriter was not delivered. 6 dead letters encountered.
[21:11:41 WRN] Couldn't join seed nodes after [5] attempts, will try again. seed-nodes=[akka.tcp://[email protected]:4053]
[21:11:41 INF] Message InitJoin from akka://sng/system/cluster/core/daemon/joinSeedNodeProcess-1 to akka://sng/deadLetters was not delivered. 7 dead letters encountered.
[21:11:46 WRN] Couldn't join seed nodes after [6] attempts, will try again. seed-nodes=[akka.tcp://[email protected]:4053]
[21:11:51 WRN] Couldn't join seed nodes after [7] attempts, will try again. seed-nodes=[akka.tcp://[email protected]:4053]

Do you have any ideas on how to debug and/or resolve this?


Asked by Rok, 18.07.2018


Answers (1)


The first thing I notice is that the non-working node's HOCON configuration has a different "seed-nodes" address from the working node's.

IMHO, the "seed-nodes" should be identical in every application (node) that joins the cluster. So in the non-working node, replace

seed-nodes = ["akka.tcp://[email protected]:4053"] 

with the entry below, which is what the working node uses:

seed-nodes = ["akka.tcp://[email protected]:4053"] 
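As a minimal sketch of that advice, every node that joins through Lighthouse would carry the same cluster.seed-nodes entry, and the actor system name inside the address has to match the name passed to ActorSystem.Create. The address below is an assumption pieced together from the Lighthouse startup output quoted in the question (ActorSystem sng, IP 127.0.0.1, port 4053); substitute your own values.

```hocon
# Assumed values: actor system "sng", Lighthouse listening on 127.0.0.1:4053
akka {
  cluster {
    # Same entry in the console node and the COM node configs
    seed-nodes = ["akka.tcp://sng@127.0.0.1:4053"]
  }
}
```

In the two joining-node configs from the question, roles is the only other cluster setting that differs.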

Also, see this GitHub repository for a sample: https://github.com/AJEETX/Akka.Cluster

and one more: https://github.com/AJEETX/AkkaNet.Cluster.RoundRobinGroup

@Rok, please let me know whether this helped, or I can investigate further.

Answered by Ajeet Kumar, 25.07.2018
comment
Thanks for the answer. You have a sharp eye, but unfortunately that was a typo I made in this post. All my configs use the same actor system name. The connections are also getting through (from a firewall perspective), because when I terminate the COM node, Lighthouse reports that the connection was reset by the remote peer. There is one thing I noticed: the COM node listens on port 37543, but Lighthouse shows its connection to the COM node on port 37546. Could that have something to do with it? - Rok; 25.07.2018
comment
Another interesting thing: I downloaded all the sources needed for Akka (Akka, DotNetty, ...), and after I replaced the NuGet packages with them, I joined the cluster successfully. Admittedly, I also upgraded the target frameworks (net45 -> net461 and netstandard1.6 -> netstandard2.0) to consolidate the NuGet packages, because I have assembly-loading problems in the COM environment. - Rok; 25.07.2018
comment
@RokB, I couldn't make sense of the logs. Maybe you could post the code somewhere, for example on GitHub. Then I'll try to reproduce the problem and take a look. - Ajeet Kumar; 26.07.2018