Golang: healthd and healthtop of the library "gocraft/health"


I'm using gocraft/health to check the health of my service and to collect metrics for each endpoint. I'm using the JSON polling sink to get the metrics.

sink := health.NewJsonPollingSink(time.Minute*5, time.Minute*5)
stream.AddSink(sink)

I want to use healthtop and healthd; here (Link) they explain how.

I set the environment variables as they said: export HEALTHD_MONITORED_HOSTPORTS=:5001 HEALTHD_SERVER_HOSTPORT=:5002 healthd

After that they say "Now you can run it", but how? They don't give any command to do it, and I don't really understand what they mean.

I navigated to src/github.com/gocraft/health/cmd/healthd and found main.go. When I run it, I get this in the console:

    [openrtb@sd-69536 healthd]$ go run main.go
    [2015-06-17T23:04:20.871743758Z]: job:general event:starting kvs:[health_host_port::5002 monitored_host_ports::5001,:5002 server_host_port::5002]
    [2015-06-17T23:04:20.87810814Z]: job:poll status:success time:4 ms kvs:[host_port::5002]
    [2015-06-17T23:04:20.881896459Z]: job:poll status:success time:8 ms kvs:[host_port::5001]
    [2015-06-17T23:04:20.882338024Z]: job:recalculate status:success time:231 μs
    [2015-06-17T23:04:23.275370787Z]: job:recalculate status:success time:6 μs
    [2015-06-17T23:04:30.875230839Z]: job:poll status:success time:1573 μs kvs:[host_port::5002]
    [2015-06-17T23:04:30.881415193Z]: job:poll status:success time:7 ms kvs:[host_port::5001]
.
.

but I get no result on these endpoints:

localhost:5002/jobs: Lists top jobs

localhost:5002/hosts: Lists all monitored hosts and their statuses

Both gave me {"error": "not_found"}

The exception is localhost:5002/health, which gave me this JSON response:

{
    "instance_id": "sd-69536.1291",
    "interval_duration": 3600000000000,
    "aggregations": [
        {
            "interval_start": "2015-06-18T01:00:00+02:00",
            "serial_number": 48,
            "jobs": {
                "general": {
                    "timers": {},
                    "events": {
                        "starting": 1
                    },
                    "event_errs": {},
                    "count": 0,
                    "nanos_sum": 0,
                    "nanos_sum_squares": 0,
                    "nanos_min": 0,
                    "nanos_max": 0,
                    "count_success": 0,
                    "count_validation_error": 0,
                    "count_panic": 0,
                    "count_error": 0,
                    "count_junk": 0
                },
                "poll": {
                    "timers": {},
                    "events": {},
                    "event_errs": {},
                    "count": 24,
                    "nanos_sum": 107049159,
                    "nanos_sum_squares": 6.06770682813009e+14,
                    "nanos_min": 1581783,
                    "nanos_max": 8259442,
                    "count_success": 24,
                    "count_validation_error": 0,
                    "count_panic": 0,
                    "count_error": 0,
                    "count_junk": 0
                },
                "recalculate": {
                    "timers": {},
                    "events": {},
                    "event_errs": {},
                    "count": 23,
                    "nanos_sum": 3501601,
                    "nanos_sum_squares": 6.75958305123e+11,
                    "nanos_min": 70639,
                    "nanos_max": 290877,
                    "count_success": 23,
                    "count_validation_error": 0,
                    "count_panic": 0,
                    "count_error": 0,
                    "count_junk": 0
                }
            },
            "timers": {},
            "events": {
                "starting": 1
            },
            "event_errs": {}
        }
    ]
}

but I have no idea what this result means, because it has no relation to my
localhost:5001/health endpoint, which healthd should normally aggregate, as they said.


There is 1 answer

evanmcdonnal On

What you downloaded is a binary, so you can just invoke it as healthd if you're in the correct directory. They actually provide this example:

HEALTHD_MONITORED_HOSTPORTS=:5020 HEALTHD_SERVER_HOSTPORT=:5032 healthd

That line sets the two variables for that single healthd invocation rather than for the whole shell session (export, on its own line, would be required to persist them beyond the one command). healthtop states more clearly what it is, but as you can see from their paths (gocraft/health/cmd/healthd and gocraft/health/cmd/healthtop), both are commands. They have several examples of using healthtop from bash; they are less explicit about healthd, but it works the same way.
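To illustrate what "variables for one command only" means, here is a minimal Go sketch that launches healthd with the two variables set just for that child process. It assumes a healthd binary is on your PATH; the helper name withHealthdEnv is made up for this example, and the port values are the ones from the README line above.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// withHealthdEnv (hypothetical helper) returns a copy of base with the two
// healthd variables appended. The caller's own environment is not modified.
func withHealthdEnv(base []string, monitored, server string) []string {
	env := append([]string{}, base...)
	env = append(env,
		"HEALTHD_MONITORED_HOSTPORTS="+monitored,
		"HEALTHD_SERVER_HOSTPORT="+server,
	)
	return env
}

func main() {
	// Assumption: the healthd binary can be found on PATH.
	cmd := exec.Command("healthd")
	// The variables exist only in the child process, like VAR=value healthd in a shell.
	cmd.Env = withHealthdEnv(os.Environ(), ":5020", ":5032")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "healthd exited:", err)
	}
}
```

After this program exits, the parent shell still has no HEALTHD_* variables set, which is the same behaviour as the one-line shell form.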

If you ran that command (as you show in your question) then you may want to try healthtop jobs or something to that effect. I don't know a ton about this project and don't care to research it, but from what I can tell healthd is just a service that collects results from various /health endpoints and makes them available in one API. It seems like they intend for you to use healthtop on top of it to view reports.

Also note this;

Great! To get a sense of the type of data healthd serves, you can manually navigate to:

/jobs: Lists top jobs
/aggregations: Provides a time series of aggregations
/aggregations/overall: Squishes all time series aggregations into one aggregation.
/hosts: Lists all monitored hosts and their statuses.

However, viewing raw JSON is just to give you a sense of the data. See the next section...

I'm not sure what the domain is (localhost:5032 if you're running locally?), but you should probably be able to go to localhost:5032/jobs and see that healthd is running and doing something. Also check your app to confirm it's up and running. Don't expect any output from it directly; that's what healthtop is for.
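If you'd rather check those routes from code than from a browser, something like the following sketch should do. It only uses the Go standard library; the paths are the ones quoted from the README above, and ":5032" is an assumption taken from their example (in your setup it would be ":5002"). The helper names baseURL and probe are made up here.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// healthdPaths lists the routes quoted from the README.
var healthdPaths = []string{"/jobs", "/aggregations", "/aggregations/overall", "/hosts"}

// baseURL turns a host:port such as ":5032" into "http://localhost:5032".
func baseURL(hostPort string) string {
	if strings.HasPrefix(hostPort, ":") {
		hostPort = "localhost" + hostPort
	}
	return "http://" + hostPort
}

// probe fetches one URL and returns the status code plus the first bytes of the body.
func probe(client *http.Client, url string) (int, string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(io.LimitReader(resp.Body, 200))
	if err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, string(body), nil
}

func main() {
	client := &http.Client{Timeout: 3 * time.Second}
	base := baseURL(":5032") // assumed value of HEALTHD_SERVER_HOSTPORT
	for _, p := range healthdPaths {
		status, body, err := probe(client, base+p)
		if err != nil {
			fmt.Printf("%s: request failed: %v\n", p, err)
			continue
		}
		fmt.Printf("%s: HTTP %d %s\n", p, status, body)
	}
}
```

A "request failed" line means nothing is listening on that port, while an HTTP status with a body like {"error": "not_found"} means a server answered but does not serve that route.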