Over the last few years, security and privacy on the internet have become significant issues due to constant hacking and spying by groups and individuals both foreign and domestic. A common defense against such intrusions is cryptography, and one common encryption algorithm is RSA. RSA is a public-key encryption algorithm; the core security of the algorithm comes from the assumed difficulty of finding the prime factorization of a composite number consisting of two large prime factors. So the practical use of the algorithm depends on the ability to generate large prime numbers that are known only to the individual who is setting up the encryption.
After studying the RSA algorithm, I became interested in the process of generating prime numbers for use in cryptography and in the statistics involved in that process. As an exercise in using the algorithms, we will attempt to generate a prime number large enough, and with the appropriate properties, that it is improbable the same prime number would ever be generated again.
The problem with prime numbers is that the algorithms used to prove that a number is prime usually run slowly. An alternative is to use a probabilistic algorithm that checks whether the number is composite. If a number repeatedly fails the composite test, we can bound the probability that it is composite and consider it to have passed an equivalent primality test. So rather than proving a number to be prime, we show that there is only a small probability that a composite number could pass this new primality test.
For example, the AKS primality test is the first deterministic primality test proven to run in polynomial time. The time complexity of the original version of this algorithm, as published in 2002, is $\tilde{O}(\log^{12} n)$, where $n$ is the number being tested for primality. The notation $\tilde{O}(f(n))$ is defined as $O(f(n) \cdot \log^{c} f(n))$ for some constant $c$. An updated version of the AKS algorithm has a time complexity of $\tilde{O}(\log^{6} n)$. Alternatively, one of the most commonly used probabilistic primality tests, the Miller-Rabin test, has a time complexity of $O(k \log^{3} n)$, where $k$ is the number of times the algorithm is applied to a number in order to reach the required probability of $n$ being prime. Comparing the time complexities of these two algorithms, we see that for large numbers the Miller-Rabin test will tend to be faster even when $k$ is large. So rather than trying to find a large prime number by proving that it is prime, we can use a probabilistic primality test with the parameter $k$ set such that the probability that the number is composite is small.
Our target is to generate a prime number that is unlikely ever to be generated again. This is achieved by finding a number large enough that the probability of selecting it specifically is small, and showing that it is probably prime with the primality test. Since the primality test is probabilistic, we should set it up so that the probability that the number is not prime is also small. But how small should these probabilities be?
We can consider the probability of some event occurring, given the number of opportunities it has to occur, to be $P = 1 - (1 - p)^n$, where $p$ is the probability of the event on a single trial, and $n$ is the number of times that the event could occur. So if $p$ is the probability of selecting a particular random number $x$, and $n$ is the number of times a random number is selected, then $P = 1 - (1 - p)^n$ is the probability that $x$ will be selected at least once after $n$ trials. In order to calculate the required size of $p$ we still need to know how small $P$ should be and the size of $n$. The value of $P$ can be set to any probability, but for this exercise we will use $P = 1/2$, so that given $n$ tries it is as likely as not that an event with probability $p$ will occur.
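As a quick sanity check, the relationship is easy to evaluate directly. A minimal sketch (the function names are mine):

```python
# Probability that an event with per-trial probability p occurs at least
# once in n independent trials: P = 1 - (1 - p)^n.
def occurrence_probability(p, n):
    return 1 - (1 - p) ** n

# Inverting for p given a target P: p = 1 - (1 - P)^(1/n).
def required_p(P, n):
    return 1 - (1 - P) ** (1 / n)

# With P = 1/2 and 100 trials, the per-trial probability comes out
# to about 0.0069: a coin-flip chance over 100 tries.
p = required_p(0.5, 100)
print(p)
```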
Now to estimate $n$. Let $N$ be the maximum number of computations that a highly efficient parallel computation device could perform over its existence, and take $n = N$. Let us assume that no more than one particle is necessary to perform a calculation. Then we can use: the Planck time as the minimum time for one particle to perform one computation, the approximate time until the heat death of the universe as the maximum time that the machine has available for calculations, and the total mass-energy of the observable universe in the form of the lightest massive particle of the standard model as the maximum number of particles available for computation. Then $N = n_{\nu} \cdot t_{\text{heat}} / t_{\text{Planck}}$, where the number of electron-neutrinos $n_{\nu}$ is calculated from the energy of the electron-neutrino and the mass-energy of the observable universe such that $n_{\nu} = E_{\text{universe}} / E_{\nu}$. Using these estimations, $N$ comes out to roughly $10^{241}$.
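The estimate can be sketched numerically. The order-of-magnitude figures below are my own assumptions, not values from the text: a Planck time of about $5.4 \times 10^{-44}$ s, about $10^{100}$ years until heat death, about $10^{70}$ J of mass-energy in the observable universe, and an electron-neutrino mass-energy of about 0.1 eV.

```python
import math

# Assumed order-of-magnitude physical figures (not from the text):
planck_time = 5.4e-44      # s, minimum time per computation
t_heat = 1e100 * 3.156e7   # s, ~1e100 years until heat death
E_universe = 1e70          # J, rough mass-energy of the observable universe
E_neutrino = 1.6e-20       # J, ~0.1 eV electron-neutrino mass-energy

n_particles = E_universe / E_neutrino   # ~6e89 neutrinos
ops_each = t_heat / planck_time         # ~6e150 computations per particle
N = n_particles * ops_each              # total computations, ~1e241
print(f"N ~ 1e{math.log10(N):.0f}")
```

Changing any assumed figure shifts the exponent by a few, but the conclusion below is insensitive to that.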
Now that we have $N$ we can calculate our small probability by setting $P = 1/2$ and $n = N$: $p = 1 - (1/2)^{1/N} \approx \ln(2)/N$, which for $N \approx 10^{241}$ is roughly $2^{-800}$. Given the assumptions and values used in the calculation of $N$, any event that occurs with a probability smaller than $p$ is as likely as not to never occur.
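One practical wrinkle: evaluating $1 - (1/2)^{1/N}$ directly in floating point yields exactly zero, because $(1/2)^{1/N}$ rounds to 1. A sketch using the first-order expansion instead (the value of $N$ is the assumption from the estimate above):

```python
from math import log

N = 1e241   # assumed computation bound from the previous estimate

# 1 - (1/2)**(1/N) evaluates to 0.0 in double precision, since
# (1/2)**(1/N) rounds to exactly 1.0.  Expanding instead:
# p = 1 - exp(-ln(2)/N) ~ ln(2)/N for tiny ln(2)/N.
p_small = log(2) / N
print(p_small)   # ~7e-242, i.e. roughly 2**-800
```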
Generating the Prime Number
Using the Miller-Rabin primality test, the probability that a composite number passes a single iteration of the test is at most $1/4$, so the probability that it passes $k$ iterations is at most $(1/4)^k$. Rearranging for the number of times the test should be run on a number so that the probability that a composite passes is at most the small probability $p$: $(1/4)^k \le p \implies k \ge \log_4(1/p) = \tfrac{1}{2}\log_2(1/p)$. With $p \approx 2^{-800}$ this gives $k = 400$, so each number should be tested $400$ times before it is considered prime.
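A minimal sketch of the Miller-Rabin test itself, in its standard formulation (the function name is mine):

```python
import random

def miller_rabin(n, k):
    """Return False if n is definitely composite, True if n passes
    k rounds of Miller-Rabin (probably prime, error at most 4**-k)."""
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    # write n - 1 = 2^r * d with d odd
    d, r = n - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # a is a witness that n is composite
    return True
```

With $k = 400$ rounds, the chance that a composite survives is at most $4^{-400} = 2^{-800}$.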
Finally, the probability of randomly selecting the prime number should be the small probability $p$. If the number were merely a random number with a value smaller than or equal to $n$, then the probability of selecting it would be $1/n$. But the number is known to be prime, so all of the composite numbers can be ignored and the probability becomes $1/\pi(n)$, where $\pi(n)$ is the number of prime numbers smaller than $n$. According to the prime number theorem, the number of primes smaller than the integer $n$ is approximately $n/\ln(n)$, so the probability of choosing a particular random prime number smaller than $n$ is approximately $\ln(n)/n$. So we need to solve the equation $\ln(n)/n = p$ for $n$. This equation can be solved numerically using the Newton-Raphson method.
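Solving $\ln(n)/n = p$ directly would overflow for a $p$ this small, so a sketch can iterate on $L = \ln(n)$ instead, where the equation becomes $L - \ln(L) = \ln(1/p)$ (the substitution, iteration, and names here are mine):

```python
from math import log

def solve_prime_size(p):
    """Newton-Raphson on g(L) = L - ln(L) - ln(1/p), with L = ln(n).
    Returns L; the prime needs about L / ln(2) bits."""
    target = log(1 / p)
    L = target                 # initial guess; g'(L) = 1 - 1/L
    for _ in range(60):
        L -= (L - log(L) - target) / (1 - 1 / L)
    return L

# With p ~ 2**-800 the required prime is roughly 809 bits.
L = solve_prime_size(2.0 ** -800)
print(L / log(2))
```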
Once we solve for $n$, we will be able to use it to generate a prime number with the Miller-Rabin primality test such that the probability of the Miller-Rabin test wrongly passing a composite is smaller than or equal to $p$, and such that the probability of the number being randomly selected by any other random process is smaller than or equal to $p$.
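Putting the pieces together, a sketch of the full procedure: sample random odd numbers of the required size until one passes the required number of Miller-Rabin rounds. The bit length (809) and round count (400) follow from $p \approx 2^{-800}$ under my assumed estimate of $N$; a compact Miller-Rabin check is repeated so the sketch is self-contained:

```python
import random

def is_probable_prime(n, rounds):
    # compact Miller-Rabin: write n - 1 = 2^r * d with d odd
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    d, r = n - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    for _ in range(rounds):
        x = pow(random.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def generate_probable_prime(bits, rounds):
    while True:
        # force the top bit (so the size is exact) and the bottom bit (odd)
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate, rounds):
            return candidate

prime = generate_probable_prime(809, 400)
print(prime.bit_length(), "bit probable prime")
```

Note that for real cryptographic use the candidates should come from a cryptographically secure source such as `random.SystemRandom`, not the default Mersenne Twister.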