OneLake gives you a data lake as a service, so you don't have to build one yourself. You've long had OneDrive for all of your documents; now you have OneLake for all of your data. OneLake provides a single data lake for your entire organization.
This means that every Fabric tenant has exactly one OneLake: never zero, never two. There is no infrastructure to set up or maintain.
A distinct benefit of a SaaS service is the concept of a tenant. It lets us automatically establish a single management and governance boundary for the whole organization, ultimately under the control of a tenant admin. Any data that lands in OneLake automatically participates in out-of-the-box data governance such as data lineage, data protection, certification, and catalog integration. The tenant admin sets this initial boundary and is ultimately in charge of all data. Different business groups, however, must still be able to operate autonomously without going through a central gatekeeper.
Through workspaces, OneLake makes distributed ownership possible. Workspaces let different parts of the organization work independently while still contributing to the same data lake. Each workspace can have its own administrator, access control, region, and billing capacity. Setting up a workspace is simple. It inherits the tenant admin's policies, so there is no need to implement the same governance again or spend time getting separate resources to work together.
You might think your company cannot have a single lake because you operate in several countries whose laws require data to be stored within their borders. OneLake addresses this by spanning the globe. Different workspaces can reside in different regions, and any data stored in a workspace resides in that workspace's region as well. OneLake is built on top of Azure Data Lake Storage (ADLS) Gen2. Under the hood, it may use multiple storage accounts in different regions, but OneLake virtualizes them into a single logical lake.
A tenant appears as one big storage account, with workspaces appearing as containers and data organized into folders within them. Because OneLake supports the ADLS Gen2 DFS APIs and SDKs, it is compatible with existing ADLS applications.
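To make the "workspace as container" mapping concrete, here is a minimal Python sketch that builds ADLS-style paths for a file in OneLake. The endpoint host onelake.dfs.fabric.microsoft.com and the "<item>.<item type>" folder naming are assumptions not stated in this article, and the workspace, item, and file names are hypothetical. Treat it as an illustration of the layout, not an official client.

```python
from urllib.parse import quote

# Assumption: the OneLake DFS endpoint host (not stated in this article).
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def _segments(path: str) -> list:
    """Split a relative path into non-empty, URL-quoted segments."""
    return [quote(part) for part in path.split("/") if part]


def onelake_https_url(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Build an https URL where the workspace plays the role of the container."""
    # Assumption: a data item shows up as a folder named "<item>.<item type>".
    parts = [quote(workspace), quote(f"{item}.{item_type}")] + _segments(path)
    return f"https://{ONELAKE_HOST}/" + "/".join(parts)


def onelake_abfss_uri(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Build the equivalent abfss:// URI used by ADLS-aware engines."""
    parts = [quote(f"{item}.{item_type}")] + _segments(path)
    return f"abfss://{quote(workspace)}@{ONELAKE_HOST}/" + "/".join(parts)


if __name__ == "__main__":
    # Hypothetical workspace "Sales" holding a lakehouse named "Orders".
    print(onelake_https_url("Sales", "Orders", "Lakehouse", "Files/raw/2023.csv"))
    print(onelake_abfss_uri("Sales", "Orders", "Lakehouse", "Tables/orders"))
```

An existing ADLS application would, under these assumptions, only need to be pointed at a path of this shape; authentication and the actual SDK calls are left out here.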
Most Fabric data items are prewired to store their data in OneLake using open file formats, and all data in OneLake is included as part of a Fabric data item. Fabric introduces a number of new data items, each with experiences tailored to specific personas: for instance, a lakehouse for data engineers and a fully transactional data warehouse for T-SQL developers. For someone used to working with storage today, the lakehouse offers the closest experience to a lake, but it also offers much more. Whichever item you start with, it keeps its data in OneLake, much as Word, Excel, and PowerPoint save their files to OneDrive. If you look at how this data is actually stored in OneLake, you won't see data items and workspaces. You will see files and folders, just as you would in a data lake today. Every workspace is a folder, and so is every data item. All tabular data is stored in Delta Lake format.
Shortcuts let you connect data across business domains without moving it. With shortcuts, your organization can easily share data between users and applications without moving or duplicating it. When teams work independently in separate workspaces, shortcuts make it possible to combine data from different business groups and domains into a virtual data product that fits a user's specific needs. A shortcut is a reference to data stored in another file location. That location can be in the same workspace or a different one, inside OneLake or outside it in ADLS or S3. Wherever the target is, the reference makes the files and folders look as if they were stored locally.
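The idea of a shortcut as a pointer can be sketched with a small in-memory model. This is a toy illustration only: the Lake class, its resolve and read methods, and all paths below are hypothetical and are not part of any Fabric API. It shows one stored copy of a file being readable through a second path in another workspace.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Lake:
    """Toy model of a lake: real files plus shortcuts that point elsewhere."""
    files: Dict[str, bytes] = field(default_factory=dict)    # path -> content
    shortcuts: Dict[str, str] = field(default_factory=dict)  # shortcut folder -> target folder

    def resolve(self, path: str, max_hops: int = 8) -> str:
        """Rewrite a path through any shortcuts until it names a real location."""
        for _ in range(max_hops):
            for prefix, target in self.shortcuts.items():
                if path == prefix or path.startswith(prefix + "/"):
                    path = target + path[len(prefix):]
                    break  # re-check the rewritten path for further shortcuts
            else:
                return path  # no shortcut matched, so the path is final
        raise ValueError("too many shortcut hops (possible cycle)")

    def read(self, path: str) -> bytes:
        """Read a file as if it were local, following shortcuts transparently."""
        return self.files[self.resolve(path)]


if __name__ == "__main__":
    lake = Lake()
    # One physical copy, owned by a hypothetical Finance workspace.
    lake.files["Finance/Ledger.Lakehouse/Tables/invoices/part-0.parquet"] = b"invoice rows"
    # A hypothetical Marketing workspace references it instead of copying it.
    lake.shortcuts["Marketing/Campaigns.Lakehouse/Tables/invoices"] = (
        "Finance/Ledger.Lakehouse/Tables/invoices"
    )
    print(lake.read("Marketing/Campaigns.Lakehouse/Tables/invoices/part-0.parquet"))
```

The data is stored once and only the reference is added in the second workspace, which is the property the paragraph above describes.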
Data can be secured at the workspace or item level. For example, a user accessing a warehouse through OneLake sees either the entire warehouse or none of it. Once the data is secured, it can be used anywhere, but only users with access to all of the data in that warehouse can access the item directly in the lake. Additional engine-specific security can be defined on top, while data in OneLake itself is secured at the item or workspace level.
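The all-or-nothing behaviour described above can be expressed as a simple check. The sketch below is a toy model under stated assumptions: the can_read_item function, the two lookup tables, and the user and item names are all hypothetical, and it does not reflect how Fabric actually stores or evaluates permissions.

```python
from typing import Dict, Set, Tuple


def can_read_item(
    user: str,
    workspace: str,
    item: str,
    workspace_members: Dict[str, Set[str]],
    item_grants: Dict[Tuple[str, str], Set[str]],
) -> bool:
    """Return True if the user may read the whole item, False otherwise.

    There is no partial answer: access is decided for the item as a whole.
    """
    # Workspace-level access covers every item in that workspace.
    if user in workspace_members.get(workspace, set()):
        return True
    # Otherwise the user needs a grant on this specific item.
    return user in item_grants.get((workspace, item), set())


if __name__ == "__main__":
    members = {"Finance": {"ana"}}                       # hypothetical workspace membership
    grants = {("Finance", "Ledger.Warehouse"): {"bo"}}   # hypothetical item-level grant
    for user in ("ana", "bo", "cy"):
        print(user, can_read_item(user, "Finance", "Ledger.Warehouse", members, grants))
```

Finer-grained rules, such as engine-specific security, would sit on top of a check like this rather than replace it.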
OneLake aims to give you the most value out of a single copy of data, without data movement or duplication. You won't need to copy data just to use it with another engine, or to break down data silos so that it can be analyzed together with other data.
A domain is a way of logically grouping all the data in an organization that is relevant to a particular area or field. Domain administrators and contributors, who create the domains, can group workspaces together within a domain.
Domains add a management boundary between the tenant and the workspace, giving domain administrators finer-grained control over a group of workspaces. Different business groups can now operate independently within the same data lake without having to manage multiple storage resources.
Let's create a lakehouse
Load data to a lakehouse
Microsoft Fabric is in preview as of today, and you can try it out for your organization with a 60-day free trial.
Thanks for reading!