Use SQL to Query Data from CDS and Dynamics 365 CE
from : https://powerobjects.com/2020/05/20/use-sql-to-query-data-from-cds-and-dynamics-365-ce/
Have you ever wanted to use T-SQL to query Dynamics 365 Customer Engagement (CE) or CDS data from a cloud-hosted environment, without having to replicate your data using the Data Export Service (DES) or some other replication solution? Of course you have! With the upcoming 2020 release Wave 1, Microsoft is rolling out this capability to all of us eager CRM data nerds. There are some steps involved to set things up, and we will go over those today.
First, a brief look at the long sought-after endgame here: writing and executing a SQL query against a live CDS environment! There are other ways we could connect to this new endpoint besides a query in SQL Server Management Studio (like via .NET C#), however, SSMS allows for easy validation of the feature setup.
Prerequisites and Caveats
1. Your CDS environment version MUST be 9.1.0.17437 or higher. This is needed to enable the feature on the backend. At time of writing, this feature is still in Preview, so take that into account when considering whether to use this in your production environments. To easily determine the version of your current CDS environment, login to it, click the Settings gear in the upper-right corner, and click About. If you’re unsure why you’re on a specific version, it is often helpful to check the Microsoft Dynamics 365 CE release train documentation for information on when to expect version availability in your region.
2. You will need a way to modify your OrgDBOrgSettings feature flags. This is typically done with the Microsoft OrgDBOrgSettings Tool (select CRM2016-Tools-KB4046795-ENU-amd64.exe for the download), but can be done in various other ways as well, like the excellent OrgDBOrgSettings managed solution from Sean McNellis which has also been updated to support this new feature flag, which is ultimately what was used here.
3. The Microsoft documentation states you’ll need SQL Server Management Studio (SSMS) version 18.4 or greater to be able to connect to the CDS database, and you should ideally go with that recommendation. That said, we’ve been able to connect successfully from SSMS version 17.9.1.
4. This is only for querying data from a CDS database (read-only). You will not be able to use it to do inserts or updates.
5. Security: Note that your read security permissions in CRM carry over to SQL – if you do not have read-access for the Account entity for example, you cannot query it from the CDS SQL endpoint.
Setup in 3 Steps
1. Verify the current version of your CDS environment as mentioned in the prerequisites. Once you have verified that your CDS environment is on version 9.1.0.17437 or later, you can proceed. If you’re not yet on that version or later, turn back now because you will not be able to enable this feature.
- If you don’t have this version or newer available for your CDS environment, check the Microsoft Dynamics 365 CE release train documentation for information on when to expect version availability in your region.
2. Next, you’ll need to enable the Tabular Data Stream (TDS) endpoint for CDS for your environment. This can be done via the OrgDBOrgSettings Tool from Microsoft, the OrgDBOrgSettings managed solution from Sean McNellis, both of which currently support editing this new feature flag (you can find links to these in the prerequisites section). The feature flag we need to enable is called ‘EnableTDSEndpoint’.
- Using the OrgDBOrgSettings Tool: there’s a great write-up from Microsoft on how to do this, however, we would consistently hit this error when trying to run the final step, as have others in the community: ‘Error occurred in OrgDBOrgSettings and the error details are GDS resource provider is unable to get instances for tenantId:…’.
- To get around this error, the OrgDBOrgSettings managed solution was used instead. Simply install the managed solution and open the solution configuration page. You will see a full list of the Org DB settings you can change – look for the one named EnableTDSEndpoint.
- Click the Add link for the EnableTDSEndpoint row, then click Edit, and set the value to be true and click Update. The result should be a row that looks like this:
3. Now you should be able to connect to your environment from SQL Server Management Studio (SSMS)!
- As stated in the prerequisites, you will need to have SSMS version 18.4 or later to connect to a CDS database.
- To connect, simply place your org URL in the Server Name field, and add ‘,5558’ to the end. So, if your org URL is dancat.crm.dynamics.com, your connection Server Name field should be dancat.crm.dynamics.com,5558.
- Set your Authentication drop-down to be ‘Azure Active Directory – Password’. Currently this is the only way you can connect to a CDS database from SSMS – you will not be able to use Windows Authentication or SQL Authentication. Enter your User name and Password, and click Connect.
Note: If you get the following error shown below: “Login failed: TDS protocol endpoint is disabled for this organization…”, you need to revisit step 2, as your EnableTDSEndpoint flag is not set correctly.
Querying the CDS Database
If you’ve done your setup correctly, when you connect to the org via SSMS (step 3 above), you should see the org DB appear on the left in the Object Explorer. For those unfamiliar, you can expand the top-level org URL, then expand the Databases section, wherein you can see your CDS database. If you expand the Tables folder in there, you can see your CDS entity tables, just as if you were connected to an on-premises CRM SQL database!
Per Microsoft, the following operations are supported (but there are more that do function outside this list): SELECT, UNION, JOIN, FILTER, batch operations, and aggregate operations like COUNT() and MIN() or MAX(). The limitation that FetchXML has for a max of 50,000 aggregate rows is gone when we use T-SQL here as well, which is great! As mentioned before, this is a read-only database, so any operation that would need write permissions to the database will not work.
Here’s a simply query, and the expected result:
select a.accountid, a.accountnumber, case when a.accountnumber is not null then 1 when a.accountnumber is null then 0 end as test_case from dbo.account a (nolock)
Performance
One would expect that T-SQL queries should perform much better and faster than their FetchXML counterparts when it comes to querying from CDS, as we should be circumventing entire process of a FetchXML query being translated to a T-SQL query on the backend.
To test this, we loaded 100,000 rows into the Account table of our CDS environment, with only a few fields of data populated as a simple performance test. To test FetchXML query performance against a large set of records, we used version 1.2019.12.1 of the FetchXML Builder add-on for the XrmToolBox application. For the T-SQL query performance, SSMS version 18.5.
Test 1: Query 3 Attributes for all Accounts
Results (mean of 5 test runs for each query):
FetchXML: 16.97 seconds T-SQL: 5.17 seconds Average Improvement: 11.8 seconds
Test 2: Query 10 Attributes for all Accounts with a Join to Users
Results (mean of 5 test runs for each query):
FetchXML: 44.62 seconds T-SQL: 83.51 seconds Average Improvement: -38.89 seconds
What do these results mean? Well, first of all, we must remember that this test is being run on a trial org in the India region due to current feature availability. That means the org will not have the same kind of resource allocation that a production environment would, so we would expect more complex queries to perform faster than they are performing here in a real-world situation. Other entity tables would perform differently than the ‘heavier’ Account entity as well. The inflection point where the number of columns began to make the Account T-SQL query slower than the FetchXML query was at 6 columns. Doing a query of all Accounts and all Columns (bad query, good load test) will hit the 2-minute timeout every time in this environment. Your mileage may absolutely vary here, so you’ll want to performance test your own specific use-cases in your environment to confirm which path may be better for now. It is likely that with the GA release of this feature and continued developments, most if not all queries against the CDS SQL endpoint should be faster than FetchXML.
Limitations
This is very new feature (in Preview still at time of writing), so there are currently several limitations you need to consider before using this solution. Here is our list of currently known or discovered limitations with this feature:
- Not every table from the CDS environment is usable with this feature. Tables like audit and plugintracelog are not available for querying here. A pleasant surprise for anyone who has used the Data Export Service to replicate a CDS database to Azure SQL will note that the dbo.activitypointer entity is present in this new CDS DB query feature, even though it is not supported for DES.
- Not every attribute from the CDS environment tables is usable with this feature. Per Microsoft documentation, data types of binary, image, ntext, sql_variant, varbinary, virtual, HierarchyId, managedproperty, file, xml, partylist, and timestamp are not usable within a CDS SQL query at this time. Perhaps the most notable one from that list is ntext. Without support for ntext, you will not see data for Multiple Line of Text fields in your queries. In fact, the system appears to be working a bit oddly as of now – if you were to look at the native Account entity, you can see it has a Description field which would be a SQL data type of ntext. In the expanded Columns list for the dbo.account entity in SSMS, it doesn’t show the Description field… that makes sense, right? Except if you do a SELECT description FROM dbo.account then it returns the column, but with NULL data regardless of what is really in the field.
- Some syntax just doesn’t work yet. Unaliased queries behave oddly in the CDS SQL query feature – if we were to run this query, it works:
select s.fullname, s.systemuserid, s.createdon, a.accountid from dbo.systemuser s (nolock) left join dbo.account a (nolock) on s.systemuserid = a.ownerid
However, if we run this query without aliases, it fails:
select * from dbo.systemuser (nolock) left join dbo.account (nolock) on dbo.systemuser.systemuserid= dbo.account.ownerid
- Common Table Expressions (CTE) are not supported yet. This is a big one, but one we’d expect to see added to the functionality eventually due to how valuable CTEs can be for CRM data.
- Possible Security/Other Implications for Retrieve and RetrieveMultiple usage. From our testing, if you can login to CRM, you can connect to the read-only CDS SQL database. Your security permissions carry over to CDS SQL for the most part, in that if you can’t read entity records in the CRM UI, you won’t be able to read them from the CDS SQL endpoint. There’s a slight gap, however, which shouldn’t affect many implementations, but it is important for some with regard to data security – if you have logic that fires on Retrieve or RetrieveMultiple, that logic WILL NOT FIREwhen the same data is queried from the CDS SQL endpoint as of now.
- For example, if a user has read-access to an entity, but that entity has a synchronous plug-in running on RetrieveMultiple of that entity to blank-out the return payload, a user could get around that with the CDS SQL connection as of now. There are theories on a new message type that a plug-in could tie into in order to have similar interception logic on a T-SQL query for an entity, however, that has not yet been documented or confirmed at this time.
- There may be a way to block CDS SQL endpoint use for specific users via security roles, however, that has not been identified or documented yet.
- The 2-minute timeout is still a thing! Under default conditions, the CDS SQL endpoint will not tolerate long-running queries. If a T-SQL query runs longer than 120 seconds, whether due to malformed queries or environment limitations, the query will fail with the error message: “The request channel timed out while waiting for a reply after 00:02:00. Increase the timeout value passed to the call to Request or increase the SendTimeout value on the Binding. The time allotted to this operation may have been a portion of a longer timeout.” Take this into consideration when using the CDS SQL endpoint, and as with other queries, make them as conservatively as you can, remembering that a CRM query that runs for 1 minute normally may take 2 minutes or more under different load conditions.
- You’ll be reconnecting in SSMS relatively often. This may be a good security feature ultimately, but SSMS will disconnect you from the CDS SQL endpoint if there’s inactivity for what appears to be 10 minutes. This may be configurable, but that is unknown at this time.
Conclusion
The CDS SQL endpoint is a feature many of us have been waiting for since the inception of CRM being in the cloud back in the days when we referred to the product simply as CRM Online. Now with Dynamics 365 CE/CDS, this feature is most welcome, and while it has certain limitations currently, we expect many of those limitations to be resolved as the product feature matures. As always, don’t hesitate to reach out to us if you have questions – we’re glad to help!