I was able to attend the first-ever Microsoft Fabric Community Conference last week. There were a ton of exciting product announcements, and this was my first big glimpse into the world of Fabric. Here’s what I learned.
What Is Microsoft Fabric?
Microsoft Fabric is a new platform that brings several of the data and analytics tools from the Microsoft BI stack together in one place.
You’ll see that this platform includes Data Factory, Synapse Data Engineering, Synapse Data Warehousing, Synapse Data Science, Synapse Real-Time Analytics, and Power BI. Data Activator is also on the list, but was not included in the above image.
Many of the things you can do in Fabric are things you could do today with other tools. For example, you can use Azure Data Factory or Synapse to move data from your source into a data lake, and then build a Power BI Report on top of that data. However, Fabric promises to make it easier. The biggest piece of magic that brings this all together is OneLake.
OneLake? What’s that?
OneLake is a simplified version of a data lake. Microsoft is positioning it as “OneDrive for Data”. You only get one per tenant, and it’s organized the same way as your workspaces in Power BI, so your existing Power BI investments will continue to pay off here. But the beauty is that you don’t really have to think about it: you just use the tools you want from Fabric, select OneLake as your destination, and Fabric writes the data out to OneLake in Delta Parquet format. This enables several things.
- Files can be queried as though they were a database. If you like databases, you can use the Warehouse concept and treat it like a database. Under the covers, it will be files in OneLake.
- If you’re more of a Spark person, you can use a Lakehouse and treat it like a data lake.
- There are options to mix and match between these – if you have a Lakehouse, you can query it via SQL.
- When it’s time to report on it, you can use Direct Lake mode to query the data directly – no Power BI report refresh needed!
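To make the “organized the same way as your workspaces” idea a bit more concrete, here is a minimal Python sketch that assembles a OneLake path from a workspace name, an item name, and an item type. The `onelake_uri` helper and the workspace, lakehouse, and table names are all hypothetical, made up for illustration. The URI layout follows my understanding of the OneLake addressing scheme, so check it against the official documentation before relying on it.

```python
# Sketch only: builds a OneLake URI from its parts.
# The helper name and all workspace/item/table names are hypothetical.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace: str, item: str, item_type: str, relative_path: str) -> str:
    """Return an abfss:// URI for a path inside a Fabric item stored in OneLake."""
    for name, value in (("workspace", workspace), ("item", item), ("item_type", item_type)):
        if not value:
            raise ValueError(f"{name} must not be empty")
    # Drop stray leading/trailing slashes so the joined path has no "//".
    path = relative_path.strip("/")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}/{path}"


# Hypothetical Lakehouse table in a hypothetical workspace.
orders_table = onelake_uri("SalesWorkspace", "SalesLakehouse", "Lakehouse", "/Tables/orders")
print(orders_table)
```

In a Fabric notebook, a path like this could plausibly be handed to Spark's standard writer, for example `df.write.format("delta").save(orders_table)`. I haven't verified that end to end, so treat it as a starting point rather than a recipe.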
Why Should I Use Fabric Instead of Building It Myself?
A lot of this depends on your background, your organization, and your budget. But having these systems already integrated makes the tools much easier to work with. You are no longer making decisions about what format to store data lake files in. You’re not debating what settings to use on your Databricks cluster to make sure it’s optimized for your solution. You’re not tuning your warehouse just to get it working at all. You’re also not provisioning a bunch of different Azure resources to manage. Instead, you provision a Fabric environment, and everything you do runs within that Fabric capacity.
Microsoft has essentially taken these products and glued them together with reasonable default values. This should allow you to get started quickly and start seeing value.
How Do I Start?
You can get started with a Fabric trial here: Fabric trial – Microsoft Fabric | Microsoft Learn
Do note that Fabric isn’t cheap, though. The smallest capacity is currently priced at $250-$300/month, so check your pricing here: Microsoft Fabric – Pricing | Microsoft Azure. It’s also unclear whether the smallest capacity will be enough for many workloads. You’ll have to fire it up, try it out for a while, and see how your usage impacts your capacity.
Learning
Have you visited the Fabric Career Hub yet? Microsoft has put together a lot of different resources (some prebuilt, some live) that will help walk you through this new technology. As of the time of this writing, there are some offers available to get a free certification exam as well. Visit the Fabric Career Hub to learn more!
Conclusion
All of this is very cool technology, and I can’t wait to get my hands dirty with it. I think there will still be questions about when to use Fabric versus other technologies for Data Engineering, but it seems this is the direction Microsoft is pointing us today.