How to Efficiently Poll for Data with React Query

I have a database collection that can be updated by a large number of users in real-time. Users need to be able to see updates from other members, but the collection is large, so re-downloading the entire collection every time is massively inefficient/expensive.

Every document has an updateTime timestamp. Given this, I want to poll for updated data as needed (either on component mount or at some other interval) and merge it into the data already cached by React Query (persisted using persistQueryClient).

I'm new to React Query, so I'm wondering whether there's a more efficient approach than the one I'm using here, where I use React Query's useInfiniteQuery hook with the newest updateTime as the next page param:

The query itself:

export function useTalentWithStore() {
    const queryClient = useQueryClient();

    const query = useInfiniteQuery<Talent[]>({
        queryKey: ['talent'],
        getNextPageParam: (lastPage, allPages) => {
            const data = allPages.flat();

            // Use the newest document's updateTime as the next page param
            const times = data.map((tal) => tal?.docMeta?.updateTime);
            if (data.length) {
                const max = Math.max(...times as any);
                return new Date(max);
            }
            return undefined;
        },
        queryFn: async ({ pageParam }) => {
            let talentQuery: firebase.firestore.CollectionReference<firebase.firestore.DocumentData> | firebase.firestore.Query<firebase.firestore.DocumentData>
                = firestore.collection("talent");

            // If there's a page param, fetch only documents updated since then; otherwise, fetch everything
            if (pageParam) {
                talentQuery = talentQuery.where("docMeta.updateTime", ">", pageParam);
            }
            let talentSnapshot = await talentQuery.get();

            const talentUpdates: Talent[] = talentSnapshot.docs.map((doc) => {
                return {
                    id: doc.id,
                    ...doc.data()
                }
            });
            return talentUpdates;
        },
        staleTime: Infinity,
    });

    // Combine new data with any old data, and return a flat array
    const flatData = useMemo<Talent[] | undefined>(() => {
        const oldData = query.data?.pages?.[0] || [];
        const newData = query.data?.pages?.[1] || [];

        const combinedData: Talent[] = [];
        if (oldData) {
            combinedData.push(...oldData);
        }
        for (const tal of newData) {
            const idx: number = combinedData.findIndex((t) => t.id === tal.id);
            if (idx >= 0) {
                combinedData[idx] = tal;
            } else {
                combinedData.push(tal);
            }
        }

        // If there's any old data, flush it out and replace it with the combined new data
        if (oldData.length) {
            queryClient.setQueryData(['talent'], (data: any) => ({
                pages: [combinedData],
                pageParams: query.data?.pageParams,
            }));
        }

        return combinedData;
    }, [query.data, queryClient]);

    return { ...query, flatData };
}

Example Usage:

const talentQuery = useTalentWithStore();
const talent = talentQuery.flatData;
const [fetchedOnMount, setFetchedOnMount] = useState(false);

useEffect(() => {
    if (!fetchedOnMount && !talentQuery.isFetching) {
        console.log(`Fetching New Talent`, !fetchedOnMount && !talentQuery.isFetching);
        talentQuery.fetchNextPage();
    }
    setFetchedOnMount(true);
}, [talentQuery, fetchedOnMount]);

Is all this really necessary, or does ReactQuery support this functionality natively?

If not, are there other approaches I should consider here or pitfalls I need to watch out for?

(Note: While this code uses Firestore, for various reasons, I don't want to use Firestore's real-time updates here)

There are 2 answers

Tim On

Answering this for anybody else who stumbles along!

There's a much simpler solution: use a normal query and pass queryClient into the queryFn, so the function can read the previously cached data and filter the request by the timestamp of the most recent record.

If you set staleTime to Infinity (as in the question) and refetchOnMount to 'always', you get a query that keeps the previous results, fetches any updates whenever the query mounts, and merges in those updates using whatever merge function you specify.

// Generic Query Function
async function getFreshCollection<T extends Doc>(queryKey: QueryKey, queryClient: QueryClient, collectionName: string) {
    // Use the previously cached data (if any) to find the latest timestamp
    const oldData: T[] = queryClient.getQueryData<T[]>(queryKey) ?? [];

    // This function can be whatever you need; see the question for my example
    const updatedSince = findLargestTimestamp(oldData);

    // Set up the query
    let firebaseQuery: firebase.firestore.CollectionReference<firebase.firestore.DocumentData> | firebase.firestore.Query<firebase.firestore.DocumentData>
        = firestore.collection(collectionName);
    
    // If there's an updatedSince, fetch only documents updated after it
    if (updatedSince) {
        firebaseQuery = firebaseQuery.where("docMeta.updateTime", ">", updatedSince);
    }

    // Actually run the query and map the data
    const snapshot = await firebaseQuery.get();
    const newData = snapshot.docs.map((doc) => {
        return { id: doc.id, ...doc.data() } as T
    });
    
    // Merge the new data with the old as needed; Simplest would be to just append the new data
    // But in my case updates could be a combo of some updates to previously fetched docs and some net new documents, 
    // so I need to overwrite old docs with newer updates and add in the new ones (see question for an example)
    return flattenData<T>(oldData, newData);
}

// Usage
export function useTalent() {
    const queryKey = ['talent'];
    const collectionName = 'talent';
    const queryClient = useQueryClient();

    const query = useQuery({
        queryKey: queryKey,
        queryFn: async () => getFreshCollection<Talent>(queryKey, queryClient, collectionName),
        staleTime: Infinity,
        refetchOnMount: 'always',
        // Note: the onError option on useQuery exists in v4 and earlier; it was removed in v5
        onError: (err) => handleFreshQueryError(err, queryKey)
    });

    return query;
}
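
The helpers findLargestTimestamp and flattenData are referenced above but not shown. Below is a minimal sketch of what they might look like, not the original implementation. It assumes a Doc shape with an id and an optional docMeta.updateTime stored as a Date or a millisecond number; Firestore Timestamp values would need converting first (for example with toMillis()).

```typescript
// Assumed document shape (hypothetical): an id plus an optional updateTime.
interface Doc {
    id: string;
    docMeta?: { updateTime?: Date | number };
}

// Returns the newest updateTime as a Date, or undefined if no document has one.
function findLargestTimestamp(docs: Doc[]): Date | undefined {
    let max = -Infinity;
    for (const doc of docs) {
        const t = doc.docMeta?.updateTime;
        if (t === undefined) continue;
        const ms = typeof t === "number" ? t : t.getTime();
        if (ms > max) max = ms;
    }
    return Number.isFinite(max) ? new Date(max) : undefined;
}

// Merges updates into the cached list by id: updated documents replace the
// old copy in place, and documents with unseen ids are appended at the end.
function flattenData<T extends Doc>(oldData: T[], newData: T[]): T[] {
    const byId = new Map<string, T>();
    for (const doc of oldData) byId.set(doc.id, doc);
    for (const doc of newData) byId.set(doc.id, doc);
    return Array.from(byId.values());
}
```

Using a Map keeps the merge linear in the number of documents, whereas the findIndex loop in the question rescans the array for every update. To poll on a timer rather than only on mount, the refetchInterval option can be set on the same query.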
Djones4822 On

I have a similar situation. Based on my research, I think the best way to handle this is to keep a dataFetch state value and enable the data query based on it, alongside a separate statusFetch query that polls for changes. The status fetch should either hit a dedicated endpoint with a minimal response, or issue a HEAD request and read the Last-Modified response header. You can then tell React Query whether it needs to fetch the latest data by flipping dataFetch to true, and set it back to false once the fetch completes.
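
As a rough sketch of the comparison step in that status check, assuming the server sends a standard Last-Modified header: the function name hasRemoteChanged and the lastSyncedMs parameter are hypothetical, and the choice to refetch when the header is missing or unparseable is one possible policy, not a requirement.

```typescript
// Decides whether the full data query needs to run, given the Last-Modified
// header from a cheap status/HEAD request and the time of the last full sync
// (milliseconds since the Unix epoch).
function hasRemoteChanged(lastModified: string | null, lastSyncedMs: number): boolean {
    // No header: we cannot tell, so err on the side of refetching.
    if (!lastModified) return true;
    const remoteMs = Date.parse(lastModified);
    // Unparseable header: same conservative choice.
    if (Number.isNaN(remoteMs)) return true;
    // Refetch only if the server copy is newer than our last sync.
    return remoteMs > lastSyncedMs;
}
```

The polling query (for example, one configured with refetchInterval) could call this with the header value and use the result to flip dataFetch, or to call queryClient.invalidateQueries on the data query's key.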

I think it is still preferable to keep React Query in charge of the data fetching rather than moving the fetch into a useEffect, since React Query offers many benefits besides polling.

However, it would be nice if I could specify a single useQuery and set the pollingFn separate from the fetchFn. I'm curious what the tanstack team would say to me about that...