We have a Vertica server fed by many data sources (Hadoop Hive, Postgres, some internal Airflow DAG pipelines). There are several tables whose data sources are unknown. The people responsible have left, and there is no information in Confluence, Jira, etc. The tables hold fresh data and are refreshed regularly. Is there any way to trace the loading process, or some of the data, back to the linked server?
I actually don't know where to start. I am using PyCharm to work with Vertica.
Try a query against both the `query_requests` and `load_streams` system tables.

If your table is `poc.tgt` and you suspect it's filled with an INSERT, search `query_requests` for INSERT statements whose text mentions the table.

If your table's name is `with_array` and you suspect it's populated by a COPY, look up its entries in `load_streams`.

Look at the other columns in `query_requests` and `load_streams` to see whether you want to add any of them to your report, to help find the populating process.

Also, join the `user_sessions` table with `query_requests` using `session_id`, and see if columns from `user_sessions` shed some more light on the matter.
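
The lookups described above could be sketched roughly as follows. This is only a starting point, not a definitive recipe: it assumes the system tables are reachable under the `v_monitor` schema, reuses the example table names `poc.tgt` and `with_array`, and picks a handful of columns that seem useful for identifying the client. The history these tables hold is limited by the cluster's retention settings, so older loads may no longer be visible.

```sql
-- 1. Suspected INSERT: statements whose text mentions the target table.
--    The NOT ILIKE filter keeps this monitoring query from matching itself.
SELECT start_timestamp, user_name, session_id, request_type, request
FROM v_monitor.query_requests
WHERE request ILIKE '%insert%poc.tgt%'
  AND request NOT ILIKE '%query_requests%'
ORDER BY start_timestamp DESC
LIMIT 50;

-- 2. Suspected COPY: load streams recorded for the target table.
SELECT load_start, session_id, stream_name, schema_name, table_name,
       accepted_row_count, rejected_row_count
FROM v_monitor.load_streams
WHERE table_name = 'with_array'
ORDER BY load_start DESC;

-- 3. Who ran it: join the matching requests to their sessions
--    to see which client host and client type issued them.
SELECT q.start_timestamp, q.user_name,
       s.client_hostname, s.client_pid, s.client_type, s.client_label,
       q.request
FROM v_monitor.query_requests AS q
JOIN v_monitor.user_sessions AS s
  ON s.session_id = q.session_id
WHERE q.request ILIKE '%poc.tgt%'
  AND q.request NOT ILIKE '%query_requests%'
ORDER BY q.start_timestamp DESC
LIMIT 50;
```

In the third query, `client_hostname` is probably the most useful column for tracing the source: it should point at the machine the load connects from, which you can then match against your Hive, Postgres, or Airflow hosts. If the loading statements rely on the search path and name the table without its schema, loosen the pattern to just the table name.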