You Really Don't Want to Know

GDPR-compliant storage of guest information in COVID-19 times

Daniel Kreuer

daniel.kreuer@juit.de

David Weichert

david.weichert@juit.de

PHP Usergroup Frankfurt

Online
January 21st, 2021

Who?

Daniel Kreuer

  • Software Developer & Architect
  • Agile Principles
  • DevOps from start to finish
  • 20+ years of experience in PHP

  • Co-Organizer of PHP Usergroup Frankfurt
  • working for JUIT GmbH since May 2019

David Weichert

  • Software Developer
  • Agile ❦ Legacy ❦ History
  • Programming professionally since 1996
  • PHP since 2006

  • working for JUIT GmbH since May 2019


What?

JUIT GmbH

  • Small Team of Software Developers, all remote
  • PHP, JavaScript, TypeScript
  • Symfony, Vue, React APIs, Typo3, SuluCMS
  • Infrastructure CI/CD, Trainings, EventStorming, Consulting

Miss Racoon

  • Cooperation with cookies + friends, agency based in Bad Homburg
  • Strong focus on data privacy, resiliency, and simplicity
  • Used in gastronomy/catering, cabaret and pop-up events

Data Privacy: Principles

  • Data Reduction and Data Economy

    Only data that is strictly necessary to fulfill the intended purpose may be processed.
  • Data Security

    Processed data must be protected at every stage, whether in transit or at rest, so that it cannot be diverted from its intended use.
  • GDPR

    Data processing is subject to laws, and all rules and regulations must be enforced wherever possible.

Data Privacy: Fails

  • Fail: Foratable (by Lunchgate)

    Scan QR code, enter private data; the confirmation page had an auto-incrementing ID.

    Data was stored longer than the stated 14 days (in backups).

  • Mitigation:

    Personal data is encrypted on the client device and then sent to the server. It can only be decrypted with the private key (which is not stored on the server).

    Data is automatically deleted from the database via cron job and backup retention is 7 days. Customers are informed that data is stored for 21 days (14 in the database, 7 in backups).

Source: https://www.golem.de/news/datenleck-corona-kontaktliste-ungeschuetzt-im-internet-abrufbar-2007-149492.html

Data Privacy: Fails II

  • Fail: Gastronovi

    Scan QR-Code, enter private data, data is stored unencrypted.

    "Our customers have the data sovereignty, they are responsible for deleting the data."

  • Mitigation:

    Personal data is encrypted on the client device and then sent to the server. It can only be decrypted with the private key (which is not stored on the server).

    Data is automatically deleted from the database via cron job and backup retention is 7 days. Customers are informed that data is stored for 21 days (14 in the database, 7 in backups).

    A digital service that stores data has to be secure and provide an automatic deletion mechanism. Otherwise, one sheet of paper per customer, collected per day in a box and shredded after 14 days, is more secure.

Source: https://www.tagesschau.de/investigativ/ndr/datenleck-restaurants-101.html
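
The automatic deletion named in the mitigation could be a small script that cron runs once a day. This is a minimal sketch, not the talk's actual implementation: the table `guest_entries`, its columns, and the in-memory SQLite database are hypothetical and only there to make the example runnable. The 14-day retention is the value from the slide.

```php
<?php
declare(strict_types=1);

/**
 * Delete entries older than the retention period and return how many rows were removed.
 * Table and column names are hypothetical.
 */
function purgeExpiredEntries(PDO $pdo, DateTimeImmutable $now, int $retentionDays = 14): int
{
    $cutoff = $now->modify("-{$retentionDays} days")->format('Y-m-d H:i:s');
    $statement = $pdo->prepare('DELETE FROM guest_entries WHERE created_at < :cutoff');
    $statement->execute(['cutoff' => $cutoff]);

    return $statement->rowCount();
}

// Demo against an in-memory SQLite database (assumption: pdo_sqlite is available).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE guest_entries (id INTEGER PRIMARY KEY, encrypted_blob TEXT, created_at TEXT)');

$insert = $pdo->prepare('INSERT INTO guest_entries (encrypted_blob, created_at) VALUES (?, ?)');
$insert->execute(['older-than-14-days', '2021-01-01 12:00:00']);
$insert->execute(['still-within-retention', '2021-01-20 12:00:00']);

$deleted = purgeExpiredEntries($pdo, new DateTimeImmutable('2021-01-21 12:00:00'));
$remaining = (int) $pdo->query('SELECT COUNT(*) FROM guest_entries')->fetchColumn();

echo "deleted: {$deleted}, remaining: {$remaining}", PHP_EOL;
```

Passing the current time in as a parameter keeps the cutoff calculation testable. The backup retention of 7 days has to be configured separately in the backup tooling.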

Symmetric Encryption: Caesar Cipher

  • Symmetric encryption algorithms use the same key to encrypt and decrypt a message
  • Julius Caesar used an alphabet shifted by three letters to encrypt military messages (replacing, for example, the letter A with D), hence ciphers of this type are often referred to as Caesar ciphers or Caesar code
  • Implementation with rotating disks
    Source: https://commons.wikimedia.org/wiki/File:CipherDisk2000.jpg (Public Domain)
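
A shifted alphabet is easy to sketch in PHP. The function name `caesar()` is made up for this example; it handles ASCII letters only and passes everything else through unchanged. A negative shift decrypts.

```php
<?php
declare(strict_types=1);

/**
 * Shift every ASCII letter by $shift positions in the alphabet.
 * Hypothetical helper for illustration, not a PHP built-in.
 */
function caesar(string $text, int $shift): string
{
    $shift = (($shift % 26) + 26) % 26; // normalize to 0..25, also for negative shifts
    $out = '';

    foreach (str_split($text) as $char) {
        if ($char >= 'A' && $char <= 'Z') {
            $out .= chr((ord($char) - 65 + $shift) % 26 + 65);
        } elseif ($char >= 'a' && $char <= 'z') {
            $out .= chr((ord($char) - 97 + $shift) % 26 + 97);
        } else {
            $out .= $char; // spaces, digits and punctuation stay as they are
        }
    }

    return $out;
}

$secret = caesar('ATTACK AT DAWN', 3);
echo $secret, PHP_EOL;             // DWWDFN DW GDZQ
echo caesar($secret, -3), PHP_EOL; // ATTACK AT DAWN
```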

Symmetric Encryption: ROT13

  • ROT13 is a Caesar cipher that uses a Latin alphabet shifted by 13
  • Because the Latin alphabet has 26 letters, the ROT13 function is its own inverse
  • PHP has the function str_rot13() implementing the algorithm
  • … or you use the shell command tr:
ROT13 with shell command tr
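
In PHP, the self-inverse property can be seen by applying `str_rot13()` twice. The sample string is arbitrary.

```php
<?php
declare(strict_types=1);

$plain = 'Hello, PHP Usergroup!';
$encoded = str_rot13($plain);

echo $encoded, PHP_EOL;            // Uryyb, CUC Hfretebhc!
echo str_rot13($encoded), PHP_EOL; // Hello, PHP Usergroup!
```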

Symmetric Encryption: Security

  • Although Caesar ciphers (and by implication ROT13) are obviously insecure, symmetric encryption can be secure if:
  • the key and the message have the same length
  • the key is random
  • the key is never used more than once
  • This is called a One-Time Pad
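
A minimal sketch of a One-Time Pad, assuming XOR as the combining operation (the slide does not name one). The helper `otpXor()` is hypothetical. The pad comes from `random_bytes()`, has the length of the message, and must be thrown away after one use.

```php
<?php
declare(strict_types=1);

/**
 * XOR a message with a pad of exactly the same length.
 * The same call encrypts and decrypts.
 */
function otpXor(string $message, string $pad): string
{
    if (strlen($message) !== strlen($pad)) {
        throw new InvalidArgumentException('Pad and message must have the same length.');
    }

    return $message ^ $pad; // PHP XORs two strings byte by byte
}

$message = 'Table 7, 19:30';
$pad = random_bytes(strlen($message)); // random, same length, never reused

$ciphertext = otpXor($message, $pad);
$decrypted = otpXor($ciphertext, $pad);

var_dump($decrypted === $message); // bool(true)
```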

Symmetric Encryption: Insecurity

  • As history shows, this still often goes wrong:
  • During World War Two the Germans used a machine to eliminate human error when generating random keys
    … yet the British at Bletchley Park were able not only to decrypt the messages, but also to identify the exact model of the machine used
  • During the Cold War the Soviet Union used One-Time Pads
    … more than once, and the NSA's practice of archiving all intercepted encrypted communication (especially messages that it could not decipher) paid off
  • In the 1980s the Fleet Broadcast System of the US Navy used the same key on every station
    … this made key distribution easier, but the entire system was breached when the Soviet Union got hold of the keys of the Alameda Naval Air Station through bribery

One central problem of symmetric key encryption is the Key Distribution Problem:
  • it is necessary to have the same key on the side of the sender and the recipient
  • getting the key to the other side securely is crucial …
  • …and also often the point of failure

Public/Private-Key Encryption

  • The answer to the key distribution problem was published fairly recently, in the 1970s
  • … although it is now known that intelligence agencies seem to have had this knowledge earlier.
  • Public/Private-Key Encryption is also called Asymmetric Encryption, because a pair of keys is used where one key cannot be easily derived from the other

How does it work?

  • A person who wants to communicate securely creates a pair of keys
  • The private key is kept secret, the public key is published, e.g. on the internet
  • A message is encrypted by the sender using the public key of the recipient
  • The recipient uses the private key (that is kept safe from everybody else) to decrypt the message
  • The public key crucially cannot be used to decrypt the message
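
The steps above can be sketched with PHP's sodium extension. Using libsodium sealed boxes is an assumption made for this example; the slides do not say which library was used in the talk. The guest record is made up.

```php
<?php
declare(strict_types=1);

// Recipient: create a key pair, publish the public key, keep the key pair secret.
$recipientKeyPair = sodium_crypto_box_keypair();
$recipientPublicKey = sodium_crypto_box_publickey($recipientKeyPair);

// Sender: only the recipient's public key is needed to encrypt.
$ciphertext = sodium_crypto_box_seal('Guest: Jane Doe, Table 7', $recipientPublicKey);

// Recipient: decryption needs the key pair that contains the private key.
$plaintext = sodium_crypto_box_seal_open($ciphertext, $recipientKeyPair);
echo $plaintext, PHP_EOL; // Guest: Jane Doe, Table 7

// Anyone holding a different key pair gets false instead of the message.
$wrongKeyPair = sodium_crypto_box_keypair();
var_dump(sodium_crypto_box_seal_open($ciphertext, $wrongKeyPair)); // bool(false)
```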

But, how does it work?

  • One-way functions, also known as Humpty Dumpty functions, are so named because, like breaking an egg (or Humpty Dumpty falling off the wall), they cannot easily be reversed, i.e. the inverse function is not trivial
  • Alice and Bob agree on a general one-way function:
        7^x (mod 11)
    Eve knows this
  • Alice chooses a secret number, known only to her, say 3
    Alice calculates: 7^3 (mod 11) = 343 (mod 11) = 2
  • Bob chooses a secret number, known only to him, say 6
    Bob calculates: 7^6 (mod 11) = 117649 (mod 11) = 4
  • They exchange the results of their calculations
    Eve knows:
    7^x (mod 11) = 2; for Alice's x
    7^x (mod 11) = 4; for Bob's x
  • Alice calculates 4^3 (mod 11) = 64 (mod 11) = 9
    Bob calculates 2^6 (mod 11) = 64 (mod 11) = 9
    Both arrive at the same number, 9, which serves as their shared key; Eve would have to reverse the one-way function to find it
  • This key exchange was discovered by Martin Hellman and published together with Whitfield Diffie in 1976.
Denslow's Humpty Dumpty (1904)
Source: https://commons.wikimedia.org/wiki/File:Denslow%27s_Humpty_Dumpty_1904.jpg (Public Domain)
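
The exchange between Alice and Bob can be replayed in a few lines of PHP with the numbers from the slide. The helper `modPow()` is written for this example and only suits small integers; real key exchanges use far larger numbers.

```php
<?php
declare(strict_types=1);

/** Square-and-multiply modular exponentiation for small non-negative integers. */
function modPow(int $base, int $exponent, int $modulus): int
{
    $result = 1 % $modulus;
    $base %= $modulus;

    while ($exponent > 0) {
        if ($exponent % 2 === 1) {
            $result = ($result * $base) % $modulus;
        }
        $base = ($base * $base) % $modulus;
        $exponent = intdiv($exponent, 2);
    }

    return $result;
}

$g = 7;  // public base, known to Eve
$p = 11; // public modulus, known to Eve

$aliceSecret = 3; // known only to Alice
$bobSecret = 6;   // known only to Bob

$alicePublic = modPow($g, $aliceSecret, $p); // 2, sent to Bob
$bobPublic = modPow($g, $bobSecret, $p);     // 4, sent to Alice

$aliceShared = modPow($bobPublic, $aliceSecret, $p); // 4^3 mod 11 = 9
$bobShared = modPow($alicePublic, $bobSecret, $p);   // 2^6 mod 11 = 9

printf("Alice sends %d, Bob sends %d, shared key: %d\n", $alicePublic, $bobPublic, $aliceShared);
```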

DEMO

What's next? - Encrypt a symmetric password asymmetrically

Process

  1. Generate a long symmetric password and a random seed
  2. Encrypt data with symmetric password and seed
  3. Encrypt symmetric password and seed with public key
  4. Store encrypted data and encrypted password/seed
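
The four steps could look roughly like this with PHP's sodium extension. Several things here are assumptions and not taken from the talk: libsodium as the library, a secretbox nonce standing in for the "seed", and the hypothetical function names `encryptForRecipient()` and `decryptWithKeyPair()`.

```php
<?php
declare(strict_types=1);

/** Steps 1 to 4: returns the two blobs that would be stored. */
function encryptForRecipient(string $data, string $publicKey): array
{
    // 1. One-time symmetric key and a random nonce (the "seed" in the slide).
    $key = sodium_crypto_secretbox_keygen();
    $nonce = random_bytes(SODIUM_CRYPTO_SECRETBOX_NONCEBYTES);

    // 2. Encrypt the data symmetrically.
    $encryptedData = sodium_crypto_secretbox($data, $nonce, $key);

    // 3. Encrypt key and nonce with the recipient's public key.
    $encryptedKey = sodium_crypto_box_seal($key . $nonce, $publicKey);

    // 4. Both blobs go into storage; neither is readable without the private key.
    return ['data' => $encryptedData, 'key' => $encryptedKey];
}

/** Reverse direction, only possible with the key pair holding the private key. */
function decryptWithKeyPair(array $stored, string $keyPair): string
{
    $opened = sodium_crypto_box_seal_open($stored['key'], $keyPair);
    if ($opened === false) {
        throw new RuntimeException('Wrong key pair or corrupted key blob.');
    }

    $key = substr($opened, 0, SODIUM_CRYPTO_SECRETBOX_KEYBYTES);
    $nonce = substr($opened, SODIUM_CRYPTO_SECRETBOX_KEYBYTES);

    $data = sodium_crypto_secretbox_open($stored['data'], $nonce, $key);
    if ($data === false) {
        throw new RuntimeException('Data blob could not be decrypted.');
    }

    return $data;
}

$keyPair = sodium_crypto_box_keypair();
$stored = encryptForRecipient('Jane Doe, +49 000 000000', sodium_crypto_box_publickey($keyPair));

echo decryptWithKeyPair($stored, $keyPair), PHP_EOL; // Jane Doe, +49 000 000000
```

Granting access to another entity would mean opening `$stored['key']` once and sealing the result again with that entity's public key; the data blob itself stays untouched.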

Benefits

  • Every blob of data is encrypted with a one-time password
  • Large amounts of data are encrypted faster with a symmetric key than asymmetrically
  • Data can only be decrypted by whoever possesses the private key
  • The symmetric password may be re-encrypted with a different public key, effectively granting access to a different entity.

What's next? - Stateless authentication without username/password

Process

  1. Client sends fingerprint and signed random string with every request
  2. 1st Level of security: Server searches for the public key in its database by fingerprint and returns only data connected to that fingerprint
  3. 2nd Level of security: Server validates signature and denies invalid signatures
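
A rough sketch of the two levels on the server side, using Ed25519 signatures from PHP's sodium extension. The SHA-256 fingerprint, the array-based "database", and the names `fingerprint()` and `handleRequest()` are assumptions for illustration; the talk's actual implementation may differ.

```php
<?php
declare(strict_types=1);

/** Fingerprint of a public key (assumption: SHA-256 hex digest). */
function fingerprint(string $publicKey): string
{
    return hash('sha256', $publicKey);
}

/**
 * Returns the data linked to the fingerprint, or null if the fingerprint
 * is unknown or the signature does not match.
 */
function handleRequest(
    array $keysByFingerprint,
    array $dataByFingerprint,
    string $fingerprint,
    string $randomString,
    string $signature
): ?array {
    // 1st level: look up the public key by fingerprint.
    $publicKey = $keysByFingerprint[$fingerprint] ?? null;
    if ($publicKey === null) {
        return null;
    }

    // 2nd level: validate the signature over the random string.
    if (strlen($signature) !== SODIUM_CRYPTO_SIGN_BYTES
        || !sodium_crypto_sign_verify_detached($signature, $randomString, $publicKey)) {
        return null;
    }

    return $dataByFingerprint[$fingerprint] ?? [];
}

// Client side: key pair, fingerprint, and a signed random string per request.
$keyPair = sodium_crypto_sign_keypair();
$publicKey = sodium_crypto_sign_publickey($keyPair);
$clientFingerprint = fingerprint($publicKey);

$randomString = random_bytes(32);
$signature = sodium_crypto_sign_detached($randomString, sodium_crypto_sign_secretkey($keyPair));

// Server side: what it knows about this client.
$keys = [$clientFingerprint => $publicKey];
$data = [$clientFingerprint => ['encrypted-blob-1', 'encrypted-blob-2']];

$response = handleRequest($keys, $data, $clientFingerprint, $randomString, $signature);
$forged = handleRequest($keys, $data, $clientFingerprint, $randomString, str_repeat("\0", SODIUM_CRYPTO_SIGN_BYTES));

var_dump($response !== null, $forged === null); // bool(true) bool(true)
```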

Benefits

  • Only data available for fingerprint may be sent back to client
  • Even if a valid fingerprint is guessed, data is still encrypted
  • No username/password required, no bearer token or OAuth ...
  • Confidentiality does not depend on TLS network-level encryption; MITM proxies still won't see the data

Drawbacks

  • The private key must be present on the client machine to decrypt data
  • Mitigation: Create a key per machine and grant access, use a hardware token implementation ...

Questions?