Data Models vs Datasets

galang.nugroho
2 min read · Mar 26, 2021

Some of this content has been copied and edited.

Data Models

A data model is a representation of the business data of an organization or business segment. It sits on top of the physical tables. You can use a model as the basis for your story.

Data models come in the following types:

  • Acquired: Data is imported (copied) and stored in Cloud. Changes made to the data in the source system don’t affect the imported data.
  • Live or Federated: Data is stored in the source system. It isn’t copied to Cloud, so changes to the source data are available immediately, provided no structural changes are made to the underlying table or SQL view.
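
The difference between the two types can be illustrated with a minimal Python sketch. This is only an analogy, not how the product stores data: `source_table`, `acquired`, and `live_rows` are hypothetical names, and a plain list stands in for a table in the source system.

```python
import copy

# Hypothetical stand-in for a table in the source system.
source_table = [{"region": "EMEA", "revenue": 100}]

# Acquired: the data is copied once, at import time.
acquired = copy.deepcopy(source_table)

# Live: nothing is copied; rows are read from the source on demand.
def live_rows():
    return source_table

# The source system changes after the import.
source_table[0]["revenue"] = 250
source_table.append({"region": "APJ", "revenue": 80})

# The acquired copy still reflects the state at import time,
# while the live read reflects the current source data.
acquired_total = sum(row["revenue"] for row in acquired)
live_total = sum(row["revenue"] for row in live_rows())
```

After the source changes, `acquired_total` still reflects the imported snapshot, while `live_total` picks up both the updated value and the new row.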

Datasets

A dataset is a simple collection of data, usually presented in a table. You can use a dataset as the basis for your story, and as a data source.

Data models and datasets complement each other. Datasets are better suited to ad hoc analysis, while data models are better suited to governed-data use cases.

Models vs. Datasets

Here’s information to help you decide between a model and a dataset.

What’s the difference between a model and a dataset?

When would I use a dataset instead of a data model?

If all you want to do is upload some data, for example in .csv or .xlsx format, and start analyzing it immediately in a story, a dataset will probably be the right choice.

Example: Say you want to change a field from a dimension to a measure. With a dataset, only the metadata definition of that column needs to change. With a model, you would have to delete the dimension table and update the fact table to include an additional column, which is more time-consuming.
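
To make the contrast concrete, here is a rough Python sketch of the same change in both shapes. It assumes a flat table with a metadata dictionary for the dataset and a small star schema (one dimension table plus a fact table) for the model. The field names and the `dimension_to_measure` helper are hypothetical and only illustrate the amount of work involved, not the product's internals.

```python
# Dataset: flat rows plus a metadata dictionary describing each column.
dataset = {
    "columns": {"region": "dimension", "discount": "dimension", "revenue": "measure"},
    "rows": [
        {"region": "EMEA", "discount": "5", "revenue": 100.0},
        {"region": "APJ", "discount": "10", "revenue": 80.0},
    ],
}

# Changing the field's role is a single metadata update.
dataset["columns"]["discount"] = "measure"

# Model: the dimension lives in its own table, and the fact table
# only holds a key that points into it.
model = {
    "dimensions": {"discount": {1: "5", 2: "10"}},
    "fact": [
        {"region": "EMEA", "discount_id": 1, "revenue": 100.0},
        {"region": "APJ", "discount_id": 2, "revenue": 80.0},
    ],
}

def dimension_to_measure(model, name):
    """Delete the dimension table and add a numeric column to the fact table."""
    members = model["dimensions"].pop(name)   # drop the dimension table
    for row in model["fact"]:                 # rewrite every fact row
        key = row.pop(f"{name}_id")
        row[name] = float(members[key])
    return model

dimension_to_measure(model, "discount")
```

The dataset change touches one dictionary entry, whereas the model change removes a table and rewrites every row of the fact table.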

When would I use a data model instead of a dataset?

If you prefer to start by defining the data structure, a model will probably be the right choice.

Typical examples of when a data model is suitable are:

  • Connecting to live data.
  • Planning use cases, where the planner already has the structure in mind and would then either input the data or import it from different sources to fit into the model.
  • Governed data that IT owns and wants to share with others.

Models guarantee that the data they hold follows a set of business rules, which ensures that workflows such as planning can run. The structure of a model can be changed either directly at the structure level, if the fact table is empty, or by rebuilding the model from the original data preparation session.
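
The rule for structural changes can be sketched in a few lines of Python. The `Model` class and the `rebuild` function below are hypothetical; they only mimic the constraint described above (edit in place while the fact table is empty, otherwise rebuild from the original data) and are not part of any real API.

```python
class Model:
    """Hypothetical model: a column list plus fact rows."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.fact_rows = []

    def add_column(self, name):
        # Structure-level change: only allowed while the fact table is empty.
        if self.fact_rows:
            raise ValueError("fact table is not empty; rebuild the model instead")
        self.columns.append(name)


def rebuild(source_rows, columns):
    """Recreate the model from the original source rows with a new structure."""
    model = Model(columns)
    model.fact_rows = [{c: row.get(c) for c in columns} for row in source_rows]
    return model


source_rows = [{"region": "EMEA", "revenue": 100, "currency": "EUR"}]

empty = Model(["region", "revenue"])
empty.add_column("currency")          # fine: no facts loaded yet

loaded = rebuild(source_rows, ["region", "revenue"])
try:
    loaded.add_column("currency")     # rejected: facts already present
    changed_in_place = True
except ValueError:
    changed_in_place = False
    # Fall back to rebuilding from the original data with the new structure.
    loaded = rebuild(source_rows, ["region", "revenue", "currency"])
```

In this sketch the empty model accepts the new column directly, while the loaded model has to be rebuilt from `source_rows` to pick up the extra column.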

Models also support row-level security and fine-grained data management of dimensions and fact tables.
