Innovation Initiative

Collaborative Machine Learning

Collaborative Machine Learning comes into play when companies do not have enough training data of their own to achieve the desired accuracy of an ML model.

Motivation

In general, machine learning algorithms require a large amount of training data, and a single company often does not have enough to achieve the desired accuracy. To address this, a company may want to train its model jointly with others without compromising the confidentiality of its own data.

Approach

Several approaches allow machine learning models to be trained across multiple data sources without disclosing the underlying data. We have identified secure multi-party computation and federated machine learning as the most promising candidates for privacy-preserving training. We also consider the protection of the trained model itself (Differential Privacy).
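To make the idea of computing on data without disclosing it more concrete, the following is a minimal sketch of additive secret sharing, one common building block of secure multi-party computation. It is an illustration only, not the protocol used in the initiative; the function names (`share`, `reconstruct`, `secure_sum`) and the modulus `PRIME` are assumptions chosen for this example, and all parties are simulated inside a single process.

```python
import secrets

# Modulus for share arithmetic (an illustrative choice, not a recommendation).
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split a non-negative integer into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recombine shares; any incomplete subset looks uniformly random."""
    return sum(shares) % PRIME


def secure_sum(private_values):
    """Sum the parties' private inputs without revealing any single input."""
    n = len(private_values)
    # Party i splits its value; share j would be sent to party j.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds the shares it received and publishes only this partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    return reconstruct(partial_sums)


# Three hypothetical companies each hold a private count.
print(secure_sum([120, 45, 300]))  # prints 465
```

Only the total is revealed; each individual share and each partial sum is masked by random values. A real deployment would additionally need secure channels between parties and a protocol for multiplication in order to train a model.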

To build expertise, we are running hands-on analyses and experiments that focus on real-life conditions (unbalanced, non-IID data):

  • Secure multi-party computation for linear models
  • Federated training of tree-based models (gradient-boosted and CART decision trees)
  • Federated training of neural networks (parameter server approach)
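As a rough illustration of federated training with a central coordinator, the sketch below runs a federated-averaging-style loop on a tiny linear model. It is a simplified stand-in, not the setup used in the experiments: the model is a one-dimensional linear regression rather than a neural network, and the function names, learning rate, and synthetic data are assumptions made for this example. The two simulated clients hold unbalanced, non-IID data (different sample counts and different input ranges), and only model parameters, never raw data, reach the coordinator.

```python
def local_update(weights, data, lr=0.05, epochs=5):
    """Gradient descent on y ≈ w*x + b using one client's own data only."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Coordinator step: average parameters, weighted by local sample count."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def train(clients, rounds=200):
    """Alternate local training and central averaging for a number of rounds."""
    weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in clients]
        weights = federated_average(updates, [len(d) for d in clients])
    return weights


# Synthetic data from y = 2x + 1 (assumed for illustration).
client_a = [(i / 10, 2 * (i / 10) + 1) for i in range(10)]  # 10 samples, x in [0, 0.9]
client_b = [(x, 2 * x + 1) for x in (2.0, 2.5, 3.0)]        # 3 samples, x in [2, 3]

w, b = train([client_a, client_b])
print(f"w={w:.3f}, b={b:.3f}")
```

Weighting each update by sample count is one simple way to handle unbalanced clients. With noisy or strongly non-IID data, local updates can drift apart, which is one of the effects such experiments are meant to study.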

Expected results

The jointly trained model is expected to achieve considerably better accuracy because it draws on more data and more features. This applies to the following scenarios, in which the data cannot be centralized:

  • Company-internal: analysis of distributed, siloed data sources across jurisdictions with stringent privacy requirements (e.g. a cross-border mortgage default model)
  • Cross-company: extended insights from the analysis of combined data on joint customers (e.g. extended features for cross-selling and upselling, or consolidated predictive maintenance)
  • Consortia: access to more data and features through secure consortia partnerships between enterprises and regulators (e.g. extended anti-money-laundering checks, payment fraud detection, fraud detection for insurance claims)

Status

In parallel with our hands-on analysis, we are running workshops with clients from different industries to identify and sharpen use cases and to carry out proofs of value.