Beatbox: How do I add a WHERE clause when pulling data from SFDC?

141 views Asked by At

In Pandas, I am creating a dataframe that merges data from two different Beatbox queries. First, I pull all my Opportunity data, then I pull all my Account data, and then I merge.

However I would like to optimize this process by only pulling data for account['ID'] that exists in the oppty['AccountID'] column, as opposed to pulling the entirety of the Account data before a merge. How do I do this?

Current State:

query_result = svc.query("SELECT ID, AccountID FROM Opportunity")
records = query_result['records']
oppty = pd.DataFrame(records)

query_result = svc.query("SELECT ID, Website FROM Account")
records = query_result['records']
account = pd.DataFrame(records)

mylist = pd.merge(oppty, account, left_on='AccountID', right_on='ID', how='left')
2

There are 2 answers

2
superfell On BEST ANSWER

You can use a SOQL semi-join to restrict the Account query to only those accounts with opportunities, e.g.

svc.query("SELECT ID,Website FROM Account where ID in (SELECT accountId FROM Opportunity)")
2
holdenweb On

Since I don't have access to your svc object this is just a suggestion. But try

query_result = svc.query("""SELECT ID, AccountID FROM Opportunity
                            JOIN Account on Account.ID = Opportunity.AccountID""")

as a single query.

This should extract the data using a left inner join, which omits unmatched rows in both tables. It also does the join in SQL, reducing the amount of database traffic (and therefore also network bandwidth) by having the database server do the work, thereby reducing the computational load on your desktop client system.