Details
-
Bug
-
Status: Done
-
Medium
-
Resolution: Fixed
-
2.27.0
-
None
-
Yes
-
Yes
-
Yes
Description
Hello all,
I am trying to implement PMM across several PostgreSQL databases and discovered that the exporter keeps opening new connections instead of reusing existing ones, as shown below. This causes a lot of entries in the PostgreSQL logs when connection logging is enabled.
# Too many connection attempts
tcp 0 0 127.0.0.1:60430 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60102 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60720 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60814 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60592 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:59822 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60840 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:59916 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60032 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60478 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:59928 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60808 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60052 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60130 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60562 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:59906 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60776 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60818 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:56006 127.0.0.1:5432 ESTABLISHED
tcp 0 0 127.0.0.1:60248 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:59870 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60318 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60218 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60668 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:59868 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60506 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60624 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60320 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60186 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60696 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60164 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60658 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60678 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60376 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60788 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60216 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60650 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60598 127.0.0.1:5432 TIME_WAIT
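A listing like the one above can be summarized by counting TCP states per destination port. The following is only a minimal sketch: the function name `tally_states` is hypothetical, and it assumes the six-column layout (proto, recv-q, send-q, local address, remote address, state) seen in the pasted output.

```python
from collections import Counter


def tally_states(netstat_output: str, port: int = 5432) -> Counter:
    """Count TCP states for connections whose remote address ends in :port."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, remote, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[4].endswith(f":{port}"):
            counts[fields[5]] += 1
    return counts


# Three lines taken from the listing above, used as sample input.
sample = """\
tcp 0 0 127.0.0.1:60430 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:60102 127.0.0.1:5432 TIME_WAIT
tcp 0 0 127.0.0.1:56006 127.0.0.1:5432 ESTABLISHED
"""
print(tally_states(sample))
```

A large TIME_WAIT count next to a single ESTABLISHED entry is consistent with short-lived connections being opened and closed repeatedly rather than one connection being reused.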
I also found that autodiscover with a Unix socket does not connect to the databases properly. I don't know whether it is possible to update DATA_SOURCE_NAME manually, or simply to disable autodiscover to avoid this.
The failed attempts cause a lot of log entries in the system journal (pmm-agent logs):
May 1 03:32:05 meltdb01 pmm-agent[1648937]: #033[36mINFO#033[0m[2022-05-01T03:32:05.979+02:00] time="2022-05-01T03:32:05+02:00" level=info msg="Established new database connection to \"test1:5432\"." source="postgres_exporter.go:916" #033[36magentID#033[0m=/agent_id/10c30cee-3c8b-4950-b0de-ff5541eb190d #033[36mcomponent#033[0m=agent-process #033[36mtype#033[0m=postgres_exporter
May 1 03:32:05 meltdb01 pmm-agent[1648937]: #033[36mINFO#033[0m[2022-05-01T03:32:05.979+02:00] time="2022-05-01T03:32:05+02:00" level=error msg="Error opening connection to database (postgres://pmm:PASSWORD_REMOVED@test1?connect_timeout=1&host=%!F(MISSING)var%!F(MISSING)run%!F(MISSING)postgresql&sslmode=disable): \"dial tcp: lookup test1 on [::1]:53: read udp [::1]:44429->[::1]:53: read: connection refused\": too many connection retries" source="postgres_exporter.go:1627" #033[36magentID#033[0m=/agent_id/10c30cee-3c8b-4950-b0de-ff5541eb190d #033[36mcomponent#033[0m=agent-process #033[36mtype#033[0m=postgres_exporter
May 1 03:32:05 meltdb01 pmm-agent[1648937]: #033[36mINFO#033[0m[2022-05-01T03:32:05.979+02:00] time="2022-05-01T03:32:05+02:00" level=info msg="Established new database connection to \"test2:5432\"." source="postgres_exporter.go:916" #033[36magentID#033[0m=/agent_id/10c30cee-3c8b-4950-b0de-ff5541eb190d #033[36mcomponent#033[0m=agent-process #033[36mtype#033[0m=postgres_exporter
May 1 03:32:05 meltdb01 pmm-agent[1648937]: #033[36mINFO#033[0m[2022-05-01T03:32:05.979+02:00] time="2022-05-01T03:32:05+02:00" level=error msg="Error opening connection to database (postgres://pmm:PASSWORD_REMOVED@test2?connect_timeout=1&host=%!F(MISSING)var%!F(MISSING)run%!F(MISSING)postgresql&sslmode=disable): \"dial tcp: lookup test2 on [::1]:53: read udp [::1]:46970->[::1]:53: read: connection refused\": too many connection retries" source="postgres_exporter.go:1627" #033[36magentID#033[0m=/agent_id/10c30cee-3c8b-4950-b0de-ff5541eb190d #033[36mcomponent#033[0m=agent-process #033[36mtype#033[0m=postgres_exporter
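The DSN in the error entries puts the database name (test1, test2) in the host position of the URI, which would explain why a DNS lookup is attempted for it. The `%!F(MISSING)` fragments are what Go's fmt package prints for a format verb without an operand, so the original text was most likely `%2F`, a URL-encoded `/`. The sketch below uses Python's `urllib.parse` purely for illustration; the exporter is written in Go and its parser may behave differently. The `describe` helper and the `socket_form` URI are assumptions of mine, the latter following the libpq convention of leaving the host empty and passing the socket directory via the `host` parameter.

```python
from urllib.parse import urlsplit, parse_qs


def describe(dsn: str) -> dict:
    """Split a PostgreSQL URI into the pieces relevant to this report."""
    parts = urlsplit(dsn)
    query = parse_qs(parts.query)
    return {
        "hostname": parts.hostname,                      # network host, if any
        "database": parts.path.lstrip("/") or None,      # path component
        "host_param": query.get("host", [None])[0],      # socket dir, decoded
    }


# DSN as logged, with %!F(MISSING) assumed to stand for %2F.
logged = (
    "postgres://pmm:PASSWORD_REMOVED@test1"
    "?connect_timeout=1&host=%2Fvar%2Frun%2Fpostgresql&sslmode=disable"
)

# Hypothetical socket-style URI: empty host, database name in the path.
socket_form = (
    "postgresql:///test1"
    "?host=%2Fvar%2Frun%2Fpostgresql&user=pmm&sslmode=disable"
)

for dsn in (logged, socket_form):
    print(describe(dsn))
```

In the logged form the parser sees `test1` as a network host and no database at all, whereas in the socket-style form the host is empty and `test1` is the database.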
I also checked with the latest exporter from Prometheus and do not see this behaviour there: the connections remain stable, with 1 connection per DB, which seems OK.
It seems this exporter is a bit behind upstream.
Thanks a lot for this wonderful tool
Best Regards
Issue Links
- relates to PMM-7806 Upgrade postgres_exporter version used in pmm from 0.8.0 to 0.10.1 (Done)