Is Kafka suitable for use over the Internet?
More precisely, I want to expose Kafka topics as a "public interface" so that external consumers (or producers) can connect to them. Is that possible?
I have heard that there are problems when the same cluster is used from both internal and external networks, because advertised.host.name is then hard to configure. Is that true?
And do I have to expose ZooKeeper as well? I think the new consumer/producer APIs no longer need it.
Kafka's wire protocol is TCP-based and works fine over the public internet. Recent versions of Kafka let you define multiple listeners, so internal and external traffic can use separate interfaces, each advertising its own address (through listeners and advertised.listeners, which supersede advertised.host.name). Production examples of Kafka over the internet include the Kafka-as-a-Service offerings from Heroku, IBM (MessageHub), and Confluent (Confluent Cloud).
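As a rough illustration of the multiple-listener setup, a broker's server.properties might look something like the fragment below. The listener names (INTERNAL, EXTERNAL), hostnames, and ports are placeholders I made up, and the choice of PLAINTEXT inside and SASL_SSL outside is only one possible arrangement; the TLS keystore and SASL settings that an encrypted listener also needs are left out.

```properties
# server.properties (broker): illustrative sketch, all names and addresses are placeholders

# Sockets the broker binds to, one per named listener
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093

# Addresses handed back to clients in metadata responses; each client is
# told the address that matches the listener it connected through
advertised.listeners=INTERNAL://broker1.internal.example:9092,EXTERNAL://broker1.example.com:9093

# Security protocol per listener name (assumed: plaintext inside, SASL over TLS outside)
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:SASL_SSL

# Brokers talk to each other over the internal listener
inter.broker.listener.name=INTERNAL
```

Because each listener advertises its own address, internal clients keep using the private hostname while external clients are directed to the public one, which is the problem a single advertised.host.name could not solve.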
You do not need to expose ZooKeeper if your Kafka clients use the new consumer API; those clients connect only to the brokers.
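To make that concrete, here is a minimal sketch of an external consumer's configuration, assuming the brokers expose a SASL-over-TLS listener at a placeholder public address. The hostname, port, group name, and SASL mechanism are all assumptions, and the credentials (sasl.jaas.config) are omitted.

```properties
# consumer.properties (external client): illustrative sketch, values are placeholders

# The only cluster address the client needs; there is no zookeeper.connect entry
bootstrap.servers=broker1.example.com:9093

# Assumed security setup for traffic crossing the public internet
security.protocol=SASL_SSL
sasl.mechanism=PLAIN

group.id=external-consumer
```

The client discovers the rest of the cluster from the metadata returned by the bootstrap broker, so only the broker ports need to be reachable from outside.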
You may also choose to expose a REST proxy, such as the open-source Confluent REST Proxy, as a more firewall-friendly interface for clients: it runs over HTTP(S), which most corporate and personal firewalls do not block.
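For a sense of what the HTTP route looks like from the client side, here is a small Python sketch that builds a produce request in the shape of the Confluent REST Proxy v2 API (POST to /topics/<topic> with a JSON "records" array). The base URL and topic name are placeholders, the helper names are my own, and authentication is left out, so treat it as a starting point rather than a finished client.

```python
import json
import urllib.request
from urllib.parse import quote


def build_produce_request(base_url, topic, values):
    """Build (but do not send) a POST that produces JSON records to a topic."""
    # The v2 API expects a "records" array, each entry carrying a "value"
    body = json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/topics/{quote(topic, safe='')}",
        data=body,
        method="POST",
        headers={
            # Content type for JSON-embedded records in the v2 API
            "Content-Type": "application/vnd.kafka.json.v2+json",
            "Accept": "application/vnd.kafka.v2+json",
        },
    )


def send(request, timeout=10):
    """Send the request and decode the JSON response (requires a reachable proxy)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Placeholder endpoint and topic; nothing is sent over the network here
req = build_produce_request(
    "https://rest-proxy.example.com", "public-events", [{"hello": "world"}]
)
print(req.get_method(), req.full_url)
```

Calling send(req) against a real proxy would perform the produce. Since this is ordinary HTTPS on a standard port, it tends to get through networks where Kafka's own TCP ports are blocked.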