Generating_Better_Random_Numbers
Random numbers are used for many things on a computer. The most important of these is generating random numbers for cryptographic algorithms used in programs such as SSH, GPG, SSL, etc... The quality or strength of your encryption depends very strongly on the "randomness" of the numbers provided by your random number generator (RNG).
Linux has a builtin RNG, that is sufficient for many simple applications, but for applications requiring higher security, something more is needed. There are several ways that you can improve upon the numbers generated by default Linux RNG. We will discuss several of these methods here.
Contents |
Background
Here we will briefly cover some necessary background in cryptography. This is a huge field, and could not possibly be covered in it's entirety here. There are some great resources at the bottom of this section if you wish to read more (you should).
Terminology
Randomness & Entropy
A good place to start discussing random number generators, is to define what it means for a number to be random. This is not as simple as it seems, and "randomness" is something that few people really understand (although many think they do).
The technical definitions of randomness get into some pretty hairy math (much of which I don't understand), and will require quite a bit of study to really understand.
There are a few basic things that most people agree on however:
- Random numbers are unpredictable -- that is in a series of truly random numbers, there is no way for you to know what the next number will be. There is no discernable pattern.
- Random numbers are not compressible, i.e. there is no shorter way to describe the number other than the number itself.
- If you have a set of numbers from which you select numbers at "random", then each number is equally likely to be selected. (This is really the same as the first...)
Entropy is technically defined as the amount of disorder or uncertainty in a system. Entropy is often mistakenly said to be the amount of "information" present in a message. However, it is actually exactly the opposite! Knowledge (information) is not uncertainty. You have to lose entropy (uncertainty) to gain information (certainty). Grokking this will make understanding the mechanism behind the Linux RNG much easier.
Think of it this way: you basically have a "pool" (a file) of numbers that are derived from all sorts of events that are unpredictable (more accurately -- not too predictable), such as keystroke timings, fluctuations in clock frequencies, static from your sound card, and more. You have no idea what these numbers are -- you are most likely unaware of how fast your hard drive read head is moving, and probably aren't keeping track of the time that lapses between each key you press on the keyboard.
This pool of numbers is an "entropy pool" -- i.e. a pool of information that you are uncertain about. You pull numbers out of there, which depletes the entropy, and converts it into information (because now that you can see the number -- you are no longer uncertain about it). Thus you drain entropy to gain information -- and the information gained is unpredictable until you have extracted it and look at it. That is, the information you have is random.
Further Reading:
Truly Random vs. Pseudorandom
Computers are very deterministic, meaning that they are very predictable -- given the current state of your computer, it is very easy to determine what it's future state will be. This is great most of the time -- since we all like our computers behaving the same each time we use them. But it makes generating "random" numbers very tricky. In order for a number to be random, it can't be predictable -- so how do you get use the predictable state of your computer to create and unpredictable number?
Most of the time, when you are supposedly getting "random" numbers on your computer, you are actually getting pseudorandom numbers instead.
Hardware & Software RNGs
Acceptable Entropy Sources
(list ~ amount of entropy that each can provide, limitations of each)
- Keystroke timings
- Caveats about finger placement and tendency for timing to be predictable as result of (show diagram)
- Mouse Movements
- mouse-coordinates will give less randomness than you might expect. the problem is that with networked applications like web browsers, the application traffic between the client and server effectively publish the locations and sequence of the client's mouse-events. thus, the only private noise that's left is the position of the cursor _within_ the button that gets pressed. since screen buttons vary in size from a few hundred to a few thousand pixels, you're really getting only 8 to 12 bits of variability per event, instead of the 20 or so that you might expect. [1][2]
- Interrupts
- Disk seek times
- Radioactive Decay
- Quantum Noise
Bad Entropy Sources
- Time (distinguish '"date" time', from variations in clock frequencies)
- only 31,556,926 seconds per year...VERY small number of possible values for a computer to test
- Values of keystrokes as opposed to keystroke timings
- Data generated might contain info about what the user is typing that would be useful to an attacker.
- English language contains low entropy (~1.3 bits/character)
- Public sources like random.org, etc
- This violates the unknowable and unpredictable part of the definition of randomness. If an attacker finds out that you get your numbers from a public source, then your "random" numbers are completely useless.
- Selections from natural language texts (no project gutenberg....)
- English language = low entropy....
- See public sources above
- Network Devices
- Can be easily influenced by attacker
Further Reading
- Using & Creating Cryptographic-Quality Random Numbers
- RFC 4086: Randomness Requirements for Security
- sci.crypt Newsgroup
- Why Cryptography Is Harder Than It Looks
- Randomness For Crypto
Understanding The Linux RNG
- Entropy sources used
- random.c
- /dev/random & /dev/urandom
- Primary / Secondary pools
- Pool size / entropy_avail
Configuring the Kernel
- Selecting the right hardware RNG (if applicable)
- char devices
- crypto--->via padlock
- crypto algorithms
Installing & Configuring the necessary software
There are several tools that can be used to gather entropy from hardware devices. Many of these are especially useful if you do not have a hardware RNG (how do I link to HW RNG section below???).
rng-tools
The rng-tools program is part of the "gkernel" package, found at http://sourceforge.net/projects/gkernel/
If you have a hardware RNG (discussed below), there is a program in here for you called rngd which will pull random bits from your the RNG into the /dev/random pool
audio-entropyd
- Sound card support must be built into kernel or module must be loaded
- Does not depend on sound being played -- uses static from device
To install the audio entropy daemon:
emerge audio-entropyd /etc/init.d/audio-entropyd start rc-update add audio-entropyd default
Clock Randomness Gathering Daemon
Gathers entropy from tiny variations between high-frequency hardware clocks...
To install:
emerge clrngd /etc/init.d/clrngd start rc-update add clrngd default
video-entropyd
- Needs v4l enabled in kernel
- Need to have a video device hooked up and pointed at a random source
To install:
emerge video-entropyd
video-entropyd doesn't have a "daemon" mode, so you will have to make a cron-job to have it run every view minutes.
(show them how to do this)
HAVEGE
Hardware RNGs
For anyone who is interested in doing anything serious, a PRNG will not be sufficient. It is way too slow, and the numbers are not of the same quality as a hardware RNG.
There are a few ways that you might come into possession of such a device.
Built-In To Your Motherboard
The lucky amongst you will have a RNG built into your motherboard:
- List of such motherboards and documentation for them
- Intel i8xx variants
- VIA mobos since ~2003 (get list of common varieties)
Purchasing One
Those of you who are wealthy, can spend hundreds of dollars to buy a commercial hardware RNG:
- Here is a brief list of such devices (there are hundreds of them out there -- just look on Google):
However, this is not the recommended approach for several reasons:
- First of all, it's a waste of money. With a bit of effort, you can build one yourself for a fraction of the cost. These things can run into the hundreds of dollars, where you can build your own for less than $50. Plus you get bragging rights -- you get to tell your friends that you have a RNG that uses radioactive decay to generate random numbers. Isn't that more exciting than telling them that you bought it?
- Mass manufactured means that if there is a flaw in it, then it is likely that the flaw is present in all units. This means that if someone discovers a weakness in it, then you no longer have "random" numbers. You have predictable numbers like everyone else that bought it.
- Third you are trusting that a person that you don't know has not designed some type of "backdoor" or predictable pattern into the device. Security does not involve trusting people -- especially people you don't know, who care more about your $$$ than your well-being.
- Cite BIOS manufacturers, etc and their known backdoors
- Cite corporate untrustworthiness in general
Build It Yourself (excellence++)
Those of you who are clever, however -- will realize that you can build a hardware RNG for a few bucks from parts bought at an electronics store (or from salvaged parts if you are REALLY clever). You must have a basic knowledge of electronics to do this, but it is nothing you can't learn with a few days worth of reading and a bit of practice handling a soldering iron:
Instructions & Schematics For Homemade RNGs
Basic Electronics Texts
Here's an in depth guide to electronics theory:
- DOE Fundamentals -- Electrical Science:
Here's a basic guide to soldering:
Testing the quality of the RNG output
NIST Tests
http://csrc.nist.gov/groups/ST/toolkit/rng/index.html
The National Institute For Standards and Technology (NIST) lists a battery of tests that it feels are the standard for testing randomness.
rngtest
For a simple test, you can use "rngtest", which is part of the rng-tools package. This will run the NIST FIPS-140-2 tests for randomness.
Installig it is as simple as:
emerge rng-tools dd if=/dev/urandom count=65535 | rngtest
This is a very easy-to-use, quick test to ensure that everything is working properly.
Dieharder
This is one of the most comprehensive RNG test suites available. It incorporates all of the NIST tests, all of the diehard tests (a very respected RNG test-suite written by George Marsaglia), and several unique tests written by the author, Robert G. Brown (the "rgb" tests). This tool is a must have.
There is currently no ebuild for this in portage, so you will have to build it from source yourself. Don't worry, it's not that hard... Not true any more :). See Method 3: Ebuild install --Anka
But first, you need the GNU Scientific Library (gsl) installed. This is simple:
emerge gsl
Now you need to download the latest version of dieharder from here: http://www.phy.duke.edu/~rgb/General/dieharder.php
You can get it in either .tar.gz or .rpm format. Whatever suits your fancy. We'll cover both methods.
Method 1: .rpm install
To do it this way you need rpm installed. If you don't have it installed already:
emerge rpm
Now all you have to do is
Method 2: .tar.gz install
Method 3: Ebuild install
There is now an ebuild for installing dieharder. You can download it at Bug #212674 (dieharder-2.24.7.ebuild and dieharder-2.24.7-build.patch is currently the latest files).
Please read HOWTO_Installing_3rd_Party_Ebuilds for more details.
Place the ebuild in an Portage Overlay. I use app-crypt as category.
Run as root (if dieharder-2.24.7 still is the newest version)
# mkdir -p /usr/local/portage/app-crypt/dieharder # cd /usr/local/portage/app-crypt/dieharder # wget -O dieharder-2.24.7.ebuild http://bugs.gentoo.org/attachment.cgi?id=145547 # mkdir -p /usr/local/portage/app-crypt/dieharder/file # wget -O files/dieharder-2.24.7-build.patch http://bugs.gentoo.org/attachment.cgi?id=145545
The package is "masked by keyword" and you will also have to calculate the digest. Read HOWTO_Installing_3rd_Party_Ebuilds for instructions.
How to use dieharder
dieharder is very well documented. The man pages and this page will tell you all you need to know to get started.
When interpreting the results, the most important thing to remember is that the "p value" is the percent chance that the numbers tested came from a "perfect" RNG. So a very high (100 being the maximum) p value means that the quality of the random numbers from the RNG you tested is very good. If it tells you p = 0.0543, that means that there is only a 5.43% chance that your numbers are really "random". On the other hand, p = 0.994 means that there is a 99.4% chance that your RNG is producing truly random numbers, which is a bit more promising.
GNU Scientific Libary
http://www.gnu.org/software/gsl/manual/html_node/Random-Number-Generation.html
The GNU Scientific library is required for the dieharder tests mentioned above. But it is also useful, in and of itself. It contains several different, high-quality PRNG functions for you to use.
If you are curious info gsl contains a wealth of information on how to use it.
See Also
Created by NickStallman.net, Luxury Homes Australia
Real estate agents should be using interactive floor plans and real estate agent tools.
