Aadhaar and signing a blank sheet of paper redux

The Aadhaar abuse that I described a year ago as a hypothetical possibility has indeed happened in reality. In July 2017, I described the scenario in a blog post as follows:

That is when I realized that the error message that I saw on the employee’s screen was not coming from the Aadhaar system, but from the telecom company’s software. … Let us think about why this is a HUGE problem. Very few people would bother to go through the bodily contortion required to read a screen whose back is turned towards them. An unscrupulous employee could simply get me to authenticate the finger print once again though there was no error and use the second authentication to allot a second SIM card in my name. He could then give me the first SIM card and hand over the second SIM to a terrorist. When that terrorist is finally caught, the SIM that he was using would be traced back to me and my life would be utterly and completely ruined.

Last week, the newspapers carried a PTI report about a case going on in the Delhi High Court about exactly this vulnerability:

The Delhi High Court on Thursday suggested incorporating recommendations, like using OTP authentication instead of biometric, given by two amicus curiae to plug a ‘loophole’ in the Aadhaar verification system that had been misused by a mobile shop owner to issue fresh SIM cards in the name of unwary customers for use in fraudulent activities. The shop owner, during Aadhaar verification of a SIM, used to make the customer give his thumb impression twice by saying it was not properly obtained the first time and the second round of authentication was then used to issue a fresh connection which was handed over to some third party, the high court had earlier noted while initiating a PIL on the issue.

This vindicates what I wrote last year:

Using Aadhaar (India’s biometric authentication system) to verify a person’s identity is relatively secure, but using it to authenticate a transaction is extremely problematic. Every other form of authentication is bound to a specific transaction: I sign a document, I put my thumb impression to a document, I digitally sign a document (or message as the cryptographers prefer to call it). In Aadhaar, I put my thumb (or other finger) on a finger print reading device, and not on the document that I am authenticating. How can anybody establish what I intended to authenticate, and what the service provider intended me to authenticate? Aadhaar authentication ignores the fundamental tenet of authentication that a transaction authentication must be inseparably bound to the document or transaction that it is authenticating. Therefore using Aadhaar to authenticate a transaction is like signing a blank sheet of paper on which the other party can write whatever it wants.
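
To make the idea of binding concrete, here is a minimal sketch in Python. It is not how Aadhaar or any telecom system works; it uses an HMAC from the standard library as a stand-in for a signature, and the key and the two request strings are hypothetical. The point is only that a tag computed over a document cannot be reused for a different document, whereas a bare fingerprint match carries no such link.

```python
import hashlib
import hmac


def sign(key: bytes, document: bytes) -> bytes:
    """Return a tag that depends on both the signer's key and the document."""
    return hmac.new(key, document, hashlib.sha256).digest()


def verify(key: bytes, document: bytes, tag: bytes) -> bool:
    """Check that the tag was produced for exactly this document."""
    return hmac.compare_digest(sign(key, document), tag)


# Hypothetical key and requests, for illustration only.
key = b"customer-secret-key"
first_sim = b"issue SIM 1 to customer"
second_sim = b"issue SIM 2 to customer"

tag = sign(key, first_sim)
bound_ok = verify(key, first_sim, tag)    # the tag matches the document it was made for
reuse_ok = verify(key, second_sim, tag)   # the same tag does not authorize another document
```

A second fingerprint scan, by contrast, looks identical to the first and says nothing about which request it was meant for.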

In the sister blog during December 2017 and January 2018

The following posts appeared on the sister blog (on Financial Markets and their Regulation) during December 2017 and January 2018.

Turning an Android phone into a travelling desktop

Installing the software in the phone

I covered this in a post a few months ago:

You can turn your phone into a miniature version of your laptop by installing a desktop Linux distribution inside your Android phone and then installing all your favourite open source software inside that.

In my case, the open source software running inside my phone includes:

This solution works quite well provided it is used sparingly (for example to make a small last minute change to a presentation). However, as one gets used to the power lurking inside the phone, one is tempted to do this more extensively, and the limitations of the phone’s tiny screen and clumsy virtual keyboard become very apparent. In this post, I talk about my attempts to overcome these limitations with the help of other gadgets and peripherals.

Turning the hotel TV into an external display

I find an external display to be a more pressing need than anything else – it is useful whether one is consuming content (for example, reading a pdf file with graphs, diagrams and equations) or creating content (for example, writing this blog post). The obvious solution to the tiny screen problem is to connect the phone to the large flat screen TV that is present in virtually every hotel room today. But implementing this idea proved non-trivial.

Many modern Android phones do not support the MHL or Slimport interfaces and so cannot provide an HDMI output from the USB port. However, almost all Android phones support casting to a TV using Google Chromecast, and so this was the solution that I adopted. Chromecast however has two serious limitations:

  • It needs an internet connection even when casting local content from the phone.
  • It does not connect to the portal based WiFi that is standard in most hotels (it does connect to standard password based WiFi networks used by home routers).

So the Chromecast needs to be supplemented by a portable WiFi router. I use the HooToo TripMate Nano which can act as a WiFi bridge that connects to one WiFi network (say the hotel WiFi) and makes that internet connection available over its own WiFi network. In a hotel room, I first power up the HooToo, connect my phone to the HooToo WiFi network, log in to the HooToo admin page and ask it to connect to the hotel WiFi network. The hotel’s login portal then comes up on my phone’s web browser and I sign in to it. Next, I connect my Chromecast to the HDMI port of the hotel TV and power it up. My Chromecast has been permanently set up to connect to the HooToo WiFi network and so it does so automatically. Now the phone and the Chromecast are connected to the same WiFi network (the HooToo WiFi) which in turn is connected to the internet through the hotel WiFi. The Chromecast now works perfectly, and I ask my phone to mirror/cast its screen to the Chromecast. Now, my phone has a 42 inch (or bigger) display on which I can read anything that is on the phone.

Both the Chromecast and the HooToo need power and I find it convenient to supply this power from a powerbank that has two charging ports. I carry a power bank anyway as an extra power supply for my phone, and by using it I avoid carrying too many chargers/adaptors and hunting for power points (sockets) in the hotel. (When I am travelling outside the country, I carry only one adapter plug and so even if the hotel has lots of power sockets, I may have access to only one because my plugs do not fit these sockets without an adaptor). This whole set (Chromecast, HooToo and power bank) is quite light and compact, and I have gotten used to carrying the set with me whenever I travel.

External keyboard and mouse

Occasionally, I find that the external display is not enough. There are some trips during which I plan to do extensive typing on my phone, and then an external bluetooth keyboard and mouse become useful. Since they are bluetooth devices, they can be used with a wide range of phones, tablets and laptops, and not just an Android phone. They end up being used at home with one device or the other, but these are much bulkier peripherals and I carry them with me during my travel only when I anticipate heavy use. On these occasions (as in the photograph below), my mobile is effectively a desktop with a large screen, comfortable keyboard and mouse.

My phone connected to hotel TV and other peripherals

Why Intel investors should subscribe to the Linux Kernel Mailing List or at least LWN

On January 3 and 4, 2018 (Wednesday and Thursday), the Intel stock price dropped by about 5% amidst massive trading volumes after The Register revealed a major security vulnerability in Intel chips on Tuesday evening (the Meltdown and Spectre bugs were officially disclosed shortly thereafter). But a bombshell had landed on the Linux Kernel on Saturday, and a careful reader would have been able to short the stock when the market opened on Tuesday (after the extended weekend). So, -1 for semi-strong form market efficiency.

Saturday’s post on LWN was very cryptic:

Linus has merged the kernel page-table isolation patch set into the mainline just ahead of the 4.15-rc6 release. This is a fundamental change that was added quite late in the development cycle; it seems a fair guess that 4.15 will have to go to -rc8, at least, before it’s ready for release.

The reason this was a bombshell is that rc6 (release candidate 6) is very late in the release cycle, when only minor bug fixes are usually made before release as version 4.15. As little as 10 days earlier, an article on LWN stated that the Kernel Page-Table Isolation (KPTI) patch would be merged only into version 4.16 and even that was regarded as rushed. The article stated that many of the core kernel developers have clearly put a lot of time into this work and concluded that:

KPTI, in other words, has all the markings of a security patch being readied under pressure from a deadline.

If merging into 4.16 looked like racing against a deadline, pushing it into 4.15 clearly indicated an emergency. The public still did not know what the bug was that KPTI was guarding against, because security researchers follow a policy of responsible disclosure where public disclosure is delayed during an embargo period which gives time to the key developers (who are informed in advance) to patch their software. But, clearly the bug must be really scary for the core developers to merge the patch into the kernel in such a tearing hurry.

One more critical piece of information had landed on LWN two days before the bombshell. On December 27, a post described a small change that had been made in the KPTI patch:

AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault.

Disable page table isolation by default on AMD processors by not setting the X86_BUG_CPU_INSECURE feature, which controls whether X86_FEATURE_PTI is set.

As Linus Torvalds put it a few days later: “not all CPU’s are crap.” Since it was already known that KPTI would degrade the performance of the processor by about 5%, the implication was clear: Intel chips would slow down by 5% relative to AMD after KPTI. In fact, one post on LWN on Monday evening (Note that Jan 2, 2018 0:00 UTC (Tue) would actually be late Monday evening in New York) did mention that trade idea:

Posted Jan 2, 2018 0:00 UTC (Tue) by Felix_the_Mac (guest, #32242)
In reply to: Kernel page-table isolation merged by GhePeU
Parent article: Kernel page-table isolation merged
I guess now would be a good time to buy AMD stock

The stock price chart shows that AMD did start rising on Tuesday, though the big volumes came only on Wednesday and Thursday. The interesting question is why was the smart money not reading the Linux Kernel Mailing List or at least LWN and getting ready for the short Intel, long AMD trade? Were they still recovering from the hangover of the New Year party?

Peripheral vision and non Euclidean Geometry

I came across a recent paper by Google researchers on Introducing a New Foveation Pipeline for Virtual/Mixed Reality:

In the human visual system, the fovea centralis allows us to see at high-fidelity in the center of our vision, allowing our brain to pay less attention to things in our peripheral vision. Foveated rendering takes advantage of this characteristic to improve the performance of the rendering engine by reducing the spatial or bit-depth resolution of objects in our peripheral vision. To make this work, the location of the High Acuity (HA) region needs to be updated with eye-tracking to align with eye saccades, which preserves the perception of a constant high-resolution across the field of view.

This reminded me of a paper (“Computer graphics, peripheral vision and non-Euclidian geometry.” Computers & Graphics 16.3 (1992): 253-258) that I wrote 25 years ago which was also based on the distinction between foveal and peripheral vision. That paper was not about virtual reality, but about small computer screens.

Computer graphics is often confronted with the task of providing the viewer with a visual picture of some object which is too large to fit on a computer screen unless the image is scaled down so drastically that much of the detail is lost. The viewer is then asked to work with a partial view of the object, and use a keyboard or a mouse to (a) scroll this image horizontally or vertically, or (b) zoom in or out, or (c) rotate the object.

The computer screen … uses clipping to implement what one might call a “cookie cutter” vision – a small portion of the “cookie” is neatly cut out and given to us. The screen is treated as a window to the “world” – everything visible from this window is displayed at the same resolution, and what is outside is simply cut out.

In the human eye, we find a gradual loss of visual clarity as we move away from the fovea to the periphery; we do not find an abrupt loss of vision at some point. … while concentrating on a small part of the field of vision [the human eye] still retains a hazy view of the peripheral region preventing it from losing sight of the total picture.

This paper argues that the lack of a similar peripheral vision is a major deficiency in computer graphics today. It then goes on to develop a mapping technique which tries to simulate this peripheral vision, and thereby make computer graphics more powerful and versatile. … The suggested mapping is closely related to non Euclidian geometry …

This to my mind, is a very important insight because experimental psychologists established over fifty years ago that perceptual geometry of human vision is in fact strongly non Euclidian – specifically hyperbolic [see, for example, Blank, A.A. “The Luneberg Theory of Binocular Space Perception”, in Koch. S., (ed), Psychology : A Study of a Science, Vol 1, New York, McGraw Hill (1959).] This experimental evidence is at first quite surprising and inexplicable.

Living as we do in a Euclidian world (the relativistic non Euclidian nature of the world is negligible for our purposes), why do we have non Euclidian vision and how do we find our way about in the world? Peripheral vision suggests an answer to both questions. We find our way about because for that we rely on our foveal vision which is Euclidian (hyperbolic geometry is locally Euclidian); we never trust our peripheral (non Euclidian) geometry for that. We have hyperbolic vision in a Euclidian world because that is the way to accommodate peripheral vision which is more important for human survival than the niceties of Euclidian geometry.

Computing power has progressed far enough for these variable scaling techniques to be done in real time for videos (and not just the still images that I had in mind a quarter century ago). I wish these techniques would come into widespread use. Whenever I am navigating using Google Maps, I find it frustrating that if I zoom in to see the turnings and intersections, I lose the big picture of where I am in the overall route. Non Euclidean mappings would allow me to zoom in on an intersection while still seeing the big picture (hazily).
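
As a rough illustration of what such a variable scaling could look like, here is a minimal Python sketch. It is not the mapping developed in the 1992 paper; the tanh radial compression, the function name fisheye and the parameter scale are my assumptions, chosen only because the map is nearly linear close to the point of focus and squeezes everything far away into the rim of a unit disc.

```python
import math


def fisheye(x: float, y: float, focus=(0.0, 0.0), scale: float = 1.0):
    """Map a point of the plane into the unit disc centred on `focus`.

    Near the focus the radius is roughly r / scale (ordinary zoom);
    distant points are compressed toward the rim instead of being clipped.
    """
    dx, dy = x - focus[0], y - focus[1]
    r = math.hypot(dx, dy)
    if r == 0.0:
        return (0.0, 0.0)
    rho = math.tanh(r / scale)  # illustrative choice of compression, not the paper's
    return (rho * dx / r, rho * dy / r)


near = fisheye(0.01, 0.0)   # almost unchanged: the "foveal" region
mid = fisheye(2.0, 0.0)     # compressed, but still distinct from nearer points
far = fisheye(100.0, 0.0)   # lands at (or within rounding of) the rim
```

On a map, the focus would be the intersection being examined, and the rest of the route would remain visible, hazily, toward the edge of the disc.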

In the sister blog during August-November 2017

The following posts appeared on the sister blog (on Financial Markets and their Regulation) during August-November 2017.

Building credit bureaus that have no personal information

In two blog posts (here and here), I have argued that in an era of widespread hacking, the credit bureau’s business model is unsustainable because it requires storing enormous amounts of confidential information on tens of millions of individuals who are not even its customers.

However, these bureaus serve a useful function of aggregating information about an individual from multiple sources and condensing all this information into a credit score that measures the credit worthiness of the individual. An individual has credit relationships with many banks and other agencies. He might have a credit card from one bank, a car loan from another bank and a home loan from a third; he may have overdue payments on one or more of these loans. He might also have an unpaid utility bill. When he applies for a new loan from yet another bank, the new bank would like to have all this information before deciding on granting the loan, but it is obviously impractical to write to every bank in the country to seek this information. It is far easier for all banks to provide information about all their customers to a central credit bureau which consolidates all this information into a composite credit score which can be accessed by any bank while granting a new loan.

The problem is that though this model is very efficient, it creates a single point of failure – a single entity that knows too much information about too many individuals. What is worse, these individuals are not customers of the bureau and cannot stop doing business with it if they do not like the privacy and security practices of the bureau.

We need to find ways to let the bureaus perform their credit scoring function without receiving or storing confidential information at all. The tool required to do this (homomorphic encryption) has been available for over a decade now, but has been under utilized in finance as I discussed in a blog post two years ago.

Suppose there is only one bank

To explain how a secure credit bureau can be built, I begin with a simple example where the bureau obtains information only from one bank (or other agency) which has the individual as a customer. I will then extend this to multiple banks.

  • The credit score of an individual can be approximated by a linear function (weighted sum) of a bunch of attributes relating to the individual:

    score = w1 x1 + w2 x2 + … + wn xn

    where wi is a weight (coefficient) and xi is an attribute (for example, x1 could indicate whether the individual is delinquent on a car loan and x2 could represent the credit card debt outstanding as a percentage of the credit limit). Since xi could be a non linear function (for example, the square or logarithm) of the underlying variable, the linear form is not really restrictive.

  • The attributes xi are known only to the bank. These are never revealed to the bureau which sees only the weighted sum above.

  • The weights wi are proprietary information that needs to be known only to the credit bureau. The bureau encrypts the weights and sends the encrypted weights to the bank.

  • Homomorphic encryption allows the bank to compute the weighted sum

    score = w1 x1 + w2 x2 + … + wn xn

    without decrypting the weights. Actually, the bank does not see the weighted sum (the score). What it computes using homomorphic encryption is the encrypted weighted sum, but the credit bureau can decrypt this and obtain the score. Since the xi are known to the bank, the computation of this scalar product requires only Additive or Partial Homomorphic Encryption (AHE or PHE) which is much more efficient than Fully Homomorphic Encryption (FHE). The GLLM method (Goethals et al. “On private scalar product computation for privacy-preserving data mining.” ICISC. Vol. 3506. 2004.) based on the Paillier AHE can do the job.

  • At the end therefore:

    1. The credit bureau knows the credit score of the individual.

    2. The credit bureau has not revealed either its scoring rule or the credit score of the individual to the bank.

    3. The bank has not revealed any confidential information about the customer to the credit bureau other than the credit score. (Note for the geeks: The privacy guarantee here is at the highest possible level – it is information theoretical (Theorem 1 of Goethals et al.) and not merely cryptographic. Even in the implausible worst case scenario where the cryptography is somehow broken, that would leak information from the credit bureau to the banks but not in the other direction.)

  • The above procedure is repeated for each individual. The wi would be the same for all individuals, but xi would of course vary from individual to individual. To be precise, we should write the i’th attribute of the k’th individual as xki.

  • If the credit bureau is hacked, confidential information belonging to the individuals is not exposed because the bureau does not have this at all. The credit scores and the scoring rule may be exposed, but this is a loss primarily to the credit bureau and there are no negative externalities involved.
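
The one bank case can be sketched in a few lines of Python. This is a toy illustration of the additive property of Paillier encryption, not the full GLLM protocol and not production cryptography: the primes are far too small, the weights and attributes are hypothetical non-negative integers (real weights would be scaled to integers), and the function names are mine.

```python
import math
import secrets

# Toy Paillier key of the credit bureau. Both numbers are Mersenne primes,
# but they are far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q            # public key
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private key: lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                               # private key (generator g = N + 1)


def encrypt(m: int) -> int:
    """Anyone can encrypt with the public key N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Only the bureau, which holds LAM and MU, can decrypt."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def encrypted_score(enc_weights, attributes):
    """Run by the bank: Enc(w1)^x1 * ... * Enc(wn)^xn = Enc(w1 x1 + ... + wn xn)."""
    acc = encrypt(0)  # fresh randomness so the result is re-randomized
    for cw, x in zip(enc_weights, attributes):
        acc = acc * pow(cw, x, N2) % N2
    return acc


# Bureau side: encrypt the (hypothetical) weights and send them to the bank.
enc_weights = [encrypt(w) for w in (40, 25, 10)]
# Bank side: the attributes never leave the bank in the clear.
attributes = [1, 73, 0]
ciphertext = encrypted_score(enc_weights, attributes)
# Bureau side: decrypt the single number it receives.
score = decrypt(ciphertext)
```

The bank handles only ciphertexts of the weights, and the bureau sees only the final score, which is the division of knowledge described in the list above.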

Extension to Multiple Banks

In general, the credit bureau will need information from many (say m) banks (or other agencies).

  • The credit score of an individual can be represented as a weighted sum of sub scores from various banks (the bureau may or may not use equal weights uj = 1 or uj = 1/m for this purpose):

    Total Score = u1 subscore1 + u2 subscore2 + … + um subscorem

    where uj is the weight of bank j and subscorej is the sub score computed using information only from bank j as follows:

    subscorej = w1 xj1 + w2 xj2 + … + wn xjn

    where xji is the i’th attribute of the individual at bank j.

  • Bank j can use homomorphic encryption to compute uj subscorej. We first define a set of modified weights vji for attribute i for bank j as:

    vji = uj wi

    and then let the bank compute a weighted sum exactly as in the one bank case but using weights vji instead of wi:

    uj subscorej = vj1 xj1 + vj2 xj2 + … + vjn xjn

  • The credit bureau adds up all the uj subscorej that it receives from various banks to find the credit score of the individual.

  • We can however get one further level of privacy in this case where the credit bureau is able to compute the total score of an individual without learning any of the subscorej. If this extra privacy is desired, we modify the procedure as follows:

    1. Bank j computes

      disguised_subscorej = uj subscorej + rj

      where rj is a random number chosen by bank j. The bank communicates the disguised_subscorej to the credit bureau. (Note for the geeks: Actually since the bank computes and communicates an encrypted form of this quantity homomorphically, it needs to encrypt rj also. This is possible since we are using public key cryptography – the public key of the credit bureau is publicly available and anybody can encrypt using this key; but only the bureau can perform decryption because only it has the private key).

    2. All the banks collectively compute the sum of all the rj using secure multi party computation based on secret sharing methods which ensure that no bank learns the rj of any other bank. The sum of all the rj (let us call it sum_r) is communicated to the credit bureau.

    3. The credit bureau computes the sum of all the disguised_subscorej. From this result, it subtracts sum_r to get the correct total credit score.

  • At the end therefore:

    1. The credit bureau knows the total credit score of the individual.
    2. The credit bureau has not revealed either its scoring rule or the credit score of the individual to any bank.

    3. The bank has not revealed any confidential information about the customer to the credit bureau: not even the sub score based on data in its possession.

  • The above procedure is repeated for each individual. The modified weights vji would be the same for all individuals at the same bank, but xji would of course vary from individual to individual. To be precise, we should write the i’th attribute of the k’th individual at the j’th bank as xjki. The rj (and therefore sum_r) should also ideally vary from individual to individual: strictly speaking, these are actually rkj and sum_rk for individual k. Similarly, disguised_subscorej should strictly speaking be disguised_subscorekj.
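
The masking step can be sketched as follows in Python. To keep the example short, the homomorphic encryption layer is left out and all arithmetic is done in the clear modulo a public number M; the additive secret sharing shown is one simple way of doing the secure multi party sum and is my assumption, as are the names M, share and masked_total and the sample sub scores.

```python
import secrets

# Hypothetical public modulus, assumed larger than any possible total score.
M = 2**61 - 1


def share(value: int, parties: int):
    """Split value into additive shares that sum to value modulo M."""
    shares = [secrets.randbelow(M) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % M)
    return shares


def masked_total(weighted_subscores):
    """Simulate all banks and the bureau for one individual."""
    m = len(weighted_subscores)
    # Step 1: bank j picks a private mask rj and sends uj*subscorej + rj to the bureau.
    masks = [secrets.randbelow(M) for _ in range(m)]
    disguised = [(s + r) % M for s, r in zip(weighted_subscores, masks)]
    # Step 2: bank j splits rj into m shares and hands share i to bank i,
    # so that no bank ever sees another bank's rj.
    all_shares = [share(r, m) for r in masks]
    partial_sums = [sum(all_shares[j][i] for j in range(m)) % M for i in range(m)]
    sum_r = sum(partial_sums) % M  # only this total reaches the bureau
    # Step 3: the bureau removes the combined mask.
    total = (sum(disguised) - sum_r) % M
    return disguised, sum_r, total


# Three hypothetical values of uj * subscorej.
disguised, sum_r, total = masked_total([120, 300, 45])
```

Each disguised value on its own is uniformly random modulo M, so the bureau learns nothing about an individual sub score; only the difference between the two sums is meaningful.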

Allowing the individual to verify all computations

How does an individual detect any errors in the credit score? How does an external auditor verify the computations for a sample of individuals?

The individual k would be entitled to receive a credit report from the credit bureau that includes (a) the unencrypted total credit score (total_scorek), (b) the encrypted disguised_subscorekj for all j, (c) the encrypted modified weights vji for all i and j and (d) sum_rk. Actually, (b), (c) and (d) should be publicly revealed by the credit bureau on its website because they do not leak any information.

The individual k would also be entitled to get two pieces of information from bank j: (a) the attributes xjki for all i and (b) the random number rkj.

With this information, the individual k can verify the computation of the encrypted disguised_subscorekj for all j (using the same homomorphic encryption method used by the banks). The individual can also verify sum_rk by adding up the rkj. Since the disguised sub scores add up to the total score plus the masks, the individual can use the public key of the credit bureau to encrypt total_scorek + sum_rk and compare this with the encrypted sum obtained by adding up all the disguised_subscorekj homomorphically.

The same procedure would allow an auditor to verify the computation for any sample of individuals.

The careful reader might wonder how the individual can detect an attempt by a bank to falsify rkj. In that case, sum_rk will not match the sum obtained by adding up the rkj, but how can the individual determine which bank is at fault? To alleviate this problem, each bank j would be required to construct a Merkle tree of the rkj (for all k) and publicly reveal the root hash of this Merkle tree. Individual k would then also be entitled to receive a path of hashes in the Merkle tree leading up to rkj. It is then impossible to falsify any of the rkj without falsifying the entire Merkle tree. Any reasonable audit procedure would detect a falsification of the entire Merkle tree. Depending on the setup, the auditor might also be able to audit (a sample of) the secure multi party computation of rkj directly by verifying a (sub) sample of the secret shares.
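
A minimal Python sketch of the Merkle tree commitment is given below. The leaf encoding, the rule of duplicating the last node on odd levels and the sample rkj values are my assumptions for illustration; a real design would also need the rkj to be large random numbers (or salted) so that they cannot be guessed from the leaf hashes.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(k: int, r: int) -> bytes:
    """Leaf for individual k holding the bank's random number r (hypothetical encoding)."""
    return h(f"leaf:{k}:{r}".encode())


def build_levels(leaves):
    """Return all levels of the tree, from the leaves up to the root."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # duplicate the last node on odd levels
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def proof(levels, index):
    """Path of sibling hashes from leaf `index` up to (but excluding) the root."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(leaf, path, root):
    """Recompute the root from a leaf and its path of sibling hashes."""
    node = leaf
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# Bank j commits to its random numbers for five hypothetical individuals.
r_values = [17, 42, 99, 7, 63]
leaves = [leaf_hash(k, r) for k, r in enumerate(r_values)]
levels = build_levels(leaves)
root = levels[-1][0]            # published by the bank

# Individual k = 4 receives r = 63 and the path, and checks them against the root.
path = proof(levels, 4)
honest = verify(leaf_hash(4, 63), path, root)
forged = verify(leaf_hash(4, 64), path, root)
```

Once the root is published, the bank cannot later hand individual 4 a different number without the check against the root failing.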

Conclusion

At the end, we would have built a secure credit bureau. An Equifax-scale hack of such a bureau would be of no concern to the public; it would be a loss only for the bureau itself. Mathematics gives us the tools required to do this. The question is whether we have the good sense and the will to use these tools. The principal obstacle might be that the credit bureau would have to earn its entire income by selling credit scores; it would not be able to sell personal information about the individual because it does not have that information. But this is a feature and not a bug.