Swarm tasks #241 (Open)

jcmcote wants to merge 9 commits into master.

Conversation

@jcmcote commented Sep 9, 2017

No description provided.

@ehazlett (Owner) commented Sep 9, 2017

Hey, do you have time for a quick chat? I don't want you to get too far ahead before we make sure we're aligned. I'm on the Docker Community Slack (ehazlett). Thanks!

@jcmcote (Author) commented Sep 12, 2017 via email

@jcmcote (Author) commented Sep 16, 2017 via email

@ehazlett (Owner) commented Sep 16, 2017 via email

@jcmcote (Author) commented Oct 23, 2017

I noticed my implementation had an issue: it cannot detect when containers are stopped on other nodes. This is currently a limitation of Docker events. From the node you are listening on, say the manager node, you only receive container events from that manager, not from the other worker nodes.

To work around this issue I'm using the poller and doing a diff of the task states, which I have access to from the manager node. This way I know when a container on a worker node is down.
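
(For illustration only, not the code in this PR: a minimal sketch of this poll-and-diff idea using the Go Docker client. The specific client calls, the 5-second interval, and the log output are assumptions, not taken from this branch.)

package main

import (
	"context"
	"log"
	"time"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/swarm"
	"github.com/docker/docker/client"
)

func main() {
	// assumes DOCKER_HOST points at a swarm manager
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		log.Fatal(err)
	}

	prev := map[string]swarm.TaskState{}
	for range time.Tick(5 * time.Second) { // poll interval is arbitrary
		tasks, err := cli.TaskList(context.Background(), types.TaskListOptions{})
		if err != nil {
			log.Println("task list error:", err)
			continue
		}
		cur := map[string]swarm.TaskState{}
		for _, t := range tasks {
			cur[t.ID] = t.Status.State
			if old, ok := prev[t.ID]; ok && old != t.Status.State {
				// e.g. a task on a worker went from "running" to "shutdown"
				log.Printf("task %s (service %s, node %s): %s -> %s",
					t.ID, t.ServiceID, t.NodeID, old, t.Status.State)
				// a proxy like interlock would regenerate its config here
			}
		}
		// tasks missing from cur but present in prev were removed entirely
		prev = cur
	}
}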

This all works well. I'm now testing with two separate stacks deployed to the same swarm. Ideally I'd like to be able to deploy multiple stacks representing various staging branches from our build host, so these stacks should work independently.

My question to you is: would it be a good idea to use stack membership to detect task changes and to generate the nginx configuration?

I know in a previous comment you said to make sure container IPs are only added to the nginx configuration if the given container and nginx are part of the same network.
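
(Illustrative sketch of how both checks could look against the swarm API types; includeTask and its parameters are hypothetical names, not part of this PR. docker stack deploy sets the com.docker.stack.namespace label on services, which could serve as the stack-membership test.)

package proxy

import (
	"strings"

	"github.com/docker/docker/api/types/swarm"
)

// includeTask returns the IP to put into the nginx upstream for a task,
// and false if the task should be skipped.
func includeTask(svc swarm.Service, t swarm.Task, stack, proxyNet string) (string, bool) {
	// stack membership: docker stack deploy labels services with the stack namespace
	if svc.Spec.Labels["com.docker.stack.namespace"] != stack {
		return "", false
	}
	// same-network check: only use an address on the network nginx shares
	for _, na := range t.NetworksAttachments {
		if na.Network.Spec.Name == proxyNet && len(na.Addresses) > 0 {
			// addresses are CIDR strings, e.g. "10.0.1.5/24"
			return strings.Split(na.Addresses[0], "/")[0], true
		}
	}
	return "", false
}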

@evanhsu commented Jan 20, 2018

It cannot detect when containers are stopped on other nodes. This is currently a limitation of Docker events. From the node you are listening on, say the manager node, you only receive container events from that manager, not from the other worker nodes.

I'm curious if you could listen for service events rather than container events on the swarm manager to determine when changes have been made on worker nodes.

Using the docker events command on a manager node, I'm able to see events on worker nodes:

root@mySwarm-01:~# docker events
2018-01-20T07:01:17.708901695Z service create unvc1mfit1qzmc9xppxyp25oz (name=hitcounter_web)
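
(For illustration: roughly what subscribing to service events from a manager could look like with the Go Docker client; the calls and names here are assumptions, not interlock code. Note that, as the follow-up below suggests, individual task or container failures on workers don't generally show up as service events, which is why the PR polls tasks instead.)

package main

import (
	"context"
	"log"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/filters"
	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		log.Fatal(err)
	}

	// service-level events (create/update/remove) are visible cluster-wide
	// from a manager, unlike per-node container events
	f := filters.NewArgs()
	f.Add("type", "service")

	msgs, errs := cli.Events(context.Background(), types.EventsOptions{Filters: f})
	for {
		select {
		case m := <-msgs:
			log.Printf("service %s: %s", m.Action, m.Actor.Attributes["name"])
			// a proxy could regenerate its config here
		case err := <-errs:
			log.Fatal(err)
		}
	}
}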

I guess you'll just have to take my word for it that the hitcounter_web service was created on a worker... here's the docker-compose.yml file that was deployed to the swarm to generate that event (using docker stack deploy):

version: '3'

services:
  web:
    image: futuredays/hitcounter
    build: .
    depends_on:
      - redis
    deploy:
      replicas: 1
      placement:
        constraints: [node.role == worker]

  redis:
    image: redis:alpine
    deploy:
      replicas: 1
      placement:
        constraints: [node.role == worker]

networks:
  default:
    external:
      name: nginx-proxy

For my use case, it would be desirable to have interlock add an entry to the nginx.conf that uses the service name rather than the container IP, like this (where 'hitcounter_web' is the service name from the stack defined above):

http {
  upstream hitcounter {
    server hitcounter_web:8000;
  }

  server {
    listen 80;
    server_name www.domain.com;
    location / {
      proxy_pass http://hitcounter;
    }
  }
}

This would allow the swarm to handle load balancing without needing to know which node(s) the service is running on.

It looks like you've already made a lot of progress on making this work with swarm mode across multiple nodes, so it seems like this is the right place to ask. Would this be a viable strategy to get around Docker's swarm events limitations?

@ehazlett (Owner) commented

For my use case, it would be desirable to have interlock add an entry to the nginx.conf that uses the service name rather than the container IP, like this (where 'hitcounter_web' is the service name from the stack defined above):

Just be aware that once you do this it will no longer be load balanced by NGINX. There will be a single name that resolves to a VIP (unless you use DNSRR, in which case you just get a random IP), and all of NGINX's load-balancing features will be lost. For example, if one of the upstreams stops responding, Docker will still send it requests because there is no upstream health checking -- it blindly forwards the request since it's L4.
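
(Side note, sketch only: the VIP vs. DNSRR distinction described here is the service's endpoint mode, which can be checked via the API. A reasonably recent Go Docker client is assumed, and "hitcounter_web" is just the example service name from this thread.)

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/api/types/swarm"
	"github.com/docker/docker/client"
)

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		log.Fatal(err)
	}
	svc, _, err := cli.ServiceInspectWithRaw(context.Background(), "hitcounter_web", types.ServiceInspectOptions{})
	if err != nil {
		log.Fatal(err)
	}
	mode := swarm.ResolutionModeVIP // the default when no endpoint mode is set
	if svc.Spec.EndpointSpec != nil && svc.Spec.EndpointSpec.Mode != "" {
		mode = svc.Spec.EndpointSpec.Mode
	}
	fmt.Println("endpoint mode:", mode) // "vip" or "dnsrr"
}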

@ehazlett (Owner) commented

FWIW, service support is implemented in a separate private fork for Docker. I'm trying to convince the powers that be to open source it, as I think it would help a lot of users running Swarm. I will at least publish some design docs to show the implementation.

@jcmcote (Author) commented Jan 21, 2018

I'm curious if you could listen for service events rather than container events on the swarm manager to determine when changes have been made on worker nodes.

@evanhsu This was also my initial idea; however, @ehazlett pointed out that this mechanism would not work for the haproxy load balancer. So I resolved the issue by polling for tasks.

This pull request does work, and I've been waiting for @ehazlett to review it so it can be merged into the main branch. My team is using this branch; however, if it does not make it into the main branch soon, there is a risk we might write our own version of interlock in Java (my team's expertise).

@evanhsu commented Jan 21, 2018

...this mechanism would not work for the haproxy load balancer.

I guess my thought is that HAproxy's load balancing feels redundant when running Docker services in swarm mode. HAproxy is still useful as a reverse proxy in this case, though, and it would be convenient to use interlock for automatic configuration of the proxy, just not necessarily for load balancing.

From https://docs.docker.com/engine/swarm/key-concepts/#load-balancing :

The swarm manager uses internal load balancing to distribute requests among services within the cluster based upon the DNS name of the service.

So when HAproxy performs load-balancing and routes a request to a specific node within the swarm (by IP), isn't that request then getting re-load-balanced by the swarm's load balancer, potentially resulting in that request being handled by a different node than the one HAproxy sent it to?

My understanding is that in order for HAproxy's load-balancer to work as expected when working with swarm services, the swarm services would need to be running in 'host' mode to avoid getting re-routed within the swarm:

From https://docs.docker.com/engine/swarm/ingress/#bypass-the-routing-mesh :

You can bypass the routing mesh, so that when you access the bound port on a given node, you are always accessing the instance of the service running on that node. This is referred to as host mode.

The complication with running swarm services in host mode (as I understand it) is that you can't run multiple replicas of the same service on the same node (for high availability) if your application expects to run on a specific port, since the replicas would all try to publish the same port on that node. HAproxy would have no way of load balancing between those replicas unless you specified a unique port for each replica when scaling the service. But at that point, it seems like we're manually managing the swarm instead of using the features that Docker Swarm provides.

Is my understanding of this issue accurate, and in light of these issues would it make sense to have interlock configure HAproxy using service names rather than IPs?

ehazlett added a commit that referenced this pull request Aug 16, 2019
Makefile: test-integration-jenkins must rebuild rttf