If you work with Power BI, you have at least entertained using DirectQuery mode for your data sources. As opposed to the Import mode which pre-loads all your sources into the Power BI model, DirectQuery essentially allows you to query the underlying data source on demand, every time you render a visualization. Data is loaded only when requested and the most recent data is displayed.
This low latency comes at a cost though: since Power BI is an online service, data has to travel from your source to the cloud in order to render each visualization on your dashboard. Furthermore, at this time you cannot combine DirectQuery sources with Import Data sources; each visualization on your dashboard will generate a live query, thus making it difficult to ensure reasonable overall performance for a data rich dashboard. So DirectQuery is suitable for simpler, high level dashboards where data needs to be near real-time. A good example would be a Call Center live dashboard on a one-minute refresh cycle.
Here’s the real issue though: the list of supported data sources in DirectQuery mode is rather short at this time, and limited to database technologies (SQL Server, Oracle, Teradata, SAP Hana, and a few more). Tragically, Salesforce is not in the list. (This post would be a lot shorter if it were, right?) But what if your requirement is to monitor in near real-time the number of Opportunities or Tasks in Salesforce? This post introduces an indirect method for enabling DirectQuery for Salesforce, and as you will see later, most any other source for which an ODBC driver is available.
(To refresh your memory, an ODBC driver allows you to work with unconventional sources is a familiar, “SQL” way. In the case of Salesforce, the ODBC driver is a wrapper around one of the Salesforce APIs, and there is quite a few of them including REST, SOAP, Bulk, and Streaming. Some ODBC drivers use two or more of these interfaces, depending on the task.)
At a high level, the process we will implement consists of the following steps:
- Identify and install an ODBC driver for Salesforce;
- Create a linked server based on an ODBC data source;
- Create SQL views for objects in the linked server;
- Use Power BI DirectQuery with a SQL Server data source to access the views.
So let’s get started. There are a few ODBC drivers for Salesforce on the market, some better than others. For this example, I am going to use the Data Direct ODBC driver for Salesforce, developed by Progress, a reputable company in this field.
After installing the driver, we need create a 64 bit ODBC data source. Note that it is important to create a System data source, rather than a User one. The settings are pretty straight forward; you will need your Salesforce URL, username and password to test it. Each driver will have its own tweaks and quirks, but those are beyond the scope of this post.
Fig 1. An ODBC Data Source based on the DataDirect Driver
Once we have the data source, it is time to create the linked server. This step is also pretty straightforward, just select the DSN from the previous step and then go on the Security tab and check the “Connection will be made using this security context.” box; use your Salesforce username and password in the required fields.
Fig 2. Linked server security settings
Next, let’s create a view against the linked server in a database of your choice, using OPENQUERY to ensure better performance. Your view may look like this:
CREATE VIEW [dbo].[vw_SF_Opportunity} AS
SELECT * FROM OPENQUERY (SFDEV,'
We can now open Power BI, start a new project and to get data from our SQL Server. Make sure you select DirectQuery (that was the point of this whole exercise, remember?)
Fig 3. Using DirectQuery with SQL Server database
Finally, we can create a simple visualization from the view we just created. After we deploy this to Power BI Online, every time the user requests this visualization on a dashboard, a live query runs against Salesforce and returns the most recent data.
Fig 4. (Near) Real-Time Salesforce visualization
That’s all there is to it. We saw how to create a linked server using an ODBC driver for our unsupported sources, and use DirectQuery to bring in near real-time data into Power BI. Word to the wise: ODBC drivers come at a cost (there are some open source options out there but I have not tested them). Also, Salesforce imposes a limit on the number of API calls you make per day; be mindful of when it makes sense to use this method versus the more traditional asynchronous/scheduled load using the native Power BI Salesforce connector.