EA case study: Example of a data product

What is a data product? 

The concept is still a bit fuzzy, so we will use the EA case study to create three different examples of data products. For this article, I asked my colleagues Anders Friis and Daniel Hellerstedt for help.

 "A data product is a reusable data asset, built to deliver a trusted dataset, for a specific purpose. It collects data from relevant data sources — including raw data — processes it, ensures data quality, and makes it accessible and understandable to anyone who needs it to meet specific needs."

Examples 

We have previously talked about business events that trigger process flows.

One of them relates to the need to report information about individuals in accordance with GDPR. Today this is a manual procedure, but with a growing number of actors in our productions there is a need for automation, hence the hypothesis that data products can help.

Yamdu: General information about an actor.

The second example relates to generic financial KPIs from APQC.

  • Cost to perform the process "invoice customer" per invoice processed

  • Personnel cost to perform the process "process customer credit" per $1,000 revenue
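Both KPIs are simple ratios, so the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names and the figures are made up, and which costs count towards each process is an assumption that would have to follow the APQC definitions.

```python
def cost_per_invoice(total_process_cost: float, invoices_processed: int) -> float:
    """Cost to perform "invoice customer" per invoice processed."""
    if invoices_processed <= 0:
        raise ValueError("invoices_processed must be positive")
    return total_process_cost / invoices_processed


def personnel_cost_per_1000_revenue(personnel_cost: float, revenue: float) -> float:
    """Personnel cost for "process customer credit" per $1,000 revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return personnel_cost * 1000 / revenue


# Made-up figures, for illustration only.
print(cost_per_invoice(12_000.0, 400))                          # 30.0
print(personnel_cost_per_1000_revenue(45_000.0, 3_000_000.0))   # 15.0
```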

The third example consists of KPIs aimed more specifically at film production budgeting and productivity.

  • Scheduled pages of manuscript per day of filming and size of crew and cast

  • Hours of pre-production, production, and post-production per minute of finished film and budget size

Information sources

The main source system for crew and actors is Yamdu, which is also the source of most of the information related to film productions. The service doesn't have an API; instead, you can export and import CSV files about productions, schedules, actors, crew, etc.
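Because the only integration point is a CSV export, the first step of any data product here is reading such a file. The sketch below shows one way to do that with Python's standard library. The column names (name, production, role) and the sample rows are assumptions made for illustration; they are not Yamdu's actual export format.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportRow:
    """One row of a hypothetical cast and crew export."""
    name: str
    production: str
    role: str


def parse_yamdu_export(csv_text: str) -> list[ExportRow]:
    """Parse CSV text with the assumed header: name,production,role."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        ExportRow(
            name=row["name"].strip(),
            production=row["production"].strip(),
            role=row["role"].strip(),
        )
        for row in reader
    ]


# Invented sample data standing in for an exported file.
SAMPLE_EXPORT = """name,production,role
Jane Doe,Example Film,Actor
John Roe,Example Film,Gaffer
"""

rows = parse_yamdu_export(SAMPLE_EXPORT)
print(len(rows))  # 2
```

In a batch job, the text would come from the exported file rather than from a string constant.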

Yamdu is the second production system we have used; before that, we used StudioBinder. The interface for the data product should therefore be system-agnostic, so that it stays the same if we change systems in the future.
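One way to keep the interface system-agnostic is to let the data product depend on a small contract and hide each source system behind an adapter. The following is a minimal sketch of that idea, not a proposed design: ProductionSource, Participation, both adapter classes, and the shapes of their input data are hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Participation:
    person_name: str
    production: str
    role: str


class ProductionSource(Protocol):
    """What the data product needs from any production system."""

    def participations(self) -> list[Participation]: ...


class YamduExportSource:
    """Adapter over rows from a CSV export (column names are assumed)."""

    def __init__(self, rows: list[dict[str, str]]) -> None:
        self._rows = rows

    def participations(self) -> list[Participation]:
        return [
            Participation(r["name"], r["production"], r["role"])
            for r in self._rows
        ]


class StudioBinderSource:
    """Adapter over (production, name, role) tuples (shape is assumed)."""

    def __init__(self, records: list[tuple[str, str, str]]) -> None:
        self._records = records

    def participations(self) -> list[Participation]:
        return [
            Participation(name, production, role)
            for (production, name, role) in self._records
        ]


def all_participations(sources: list[ProductionSource]) -> list[Participation]:
    """Consumers see one uniform list, whatever systems sit behind it."""
    result: list[Participation] = []
    for source in sources:
        result.extend(source.participations())
    return result
```

Replacing a production system then means writing a new adapter, while consumers of the data product keep calling the same function.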

As we have several sources of personal data, our approach must take this into account. Future expansion should include both Servera, which is our CRM system, and Visma, which is used for both finance and time reporting. Including Microsoft Office 365, we have four major platforms, plus a large number of specialized applications for film production.

What I as a user would like to have as an MVP is a service that delivers the productions an individual has participated in, and in what roles. That is, a very simple data product that contains both information about an individual and the data related to the production, and thus more than master data. It should then be possible to extend the data product to other types of personal information.
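The core of such an MVP is a lookup from an individual to their productions and roles. The sketch below illustrates that lookup on in-memory data; the Credit record, the person_id field, and the sample values are assumptions, and a real service would add access control given how sensitive the data is.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Credit:
    person_id: str
    production: str
    role: str


def productions_for(person_id: str, credits: list[Credit]) -> dict[str, list[str]]:
    """Map each production the person took part in to their roles there."""
    result: dict[str, list[str]] = defaultdict(list)
    for credit in credits:
        # Skip other people, and avoid listing the same role twice.
        if credit.person_id == person_id and credit.role not in result[credit.production]:
            result[credit.production].append(credit.role)
    return dict(result)


# Invented sample data.
CREDITS = [
    Credit("p1", "Film A", "Actor"),
    Credit("p1", "Film A", "Stunt performer"),
    Credit("p1", "Film B", "Actor"),
    Credit("p2", "Film A", "Director"),
]

print(productions_for("p1", CREDITS))
# {'Film A': ['Actor', 'Stunt performer'], 'Film B': ['Actor']}
```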

The information about productions can be highly confidential and very sensitive for individuals. Therefore, security and privacy are very important.

Design considerations

We use Azure AD and Office 365 and have a cloud-first strategy, so the logical platform for our data products is Microsoft Azure.

As we need to use batches to retrieve information about individuals from Yamdu, Visma and Servera, and these individuals are not necessarily present in Azure AD, we need to store the information somewhere else. The same principle applies to other types of information.

The question is whether we should go for a master data approach, where we keep all master data for parties (customers, actors, crew, suppliers and partners) in one place and production-related information somewhere else, and use the data mesh to combine them, or use one big data store for everything.

Information about individuals is present in several business capabilities, e.g. Sales, Finance, HR and Film production, but information about film production is limited to the last of these.

This is why my recommendation is to create a solution based on data mesh, even if our sources are initially in only one capability.

When looking at open standards for information modeling and APIs, we selected TM Forum's models and Open APIs, as we are part of the Telecom, Media and Entertainment sector.

Our business processes are mainly based on APQC for broadcasting, with some improvements in sales and production processes to better fit with our business model.

Next: an MVP for a data product.