Random Record Generation???
-
- Posts: 2
- Joined: Tue Oct 10, 2017 12:44 pm
Random Record Generation???
We are trying to create a report that will pull records out of our database randomly. Is there a function in RRv10 that allows for us to pull records randomly?
-
- Posts: 49
- Joined: Tue Oct 10, 2017 12:44 pm
=> RE: Random Record Generation???
I had this old R&R Tech Tip in a folder. Maybe it will help.______T E C H T I P #7__Randomize__Every once in a while the need to generate a random value pops up in an R&R Report Writer application. In my experience, it^s frequently to randomly pick certain records for audit or other close examination. __Computer programs are deterministic and cannot, therefore, generate a truly random sequence. However, they can fake it very well! (If you are interested in the theory of this, see Chapter 3 in Volume 2 of Donald Knuth^s The Art of Computer Programming.) In the context of R&R, the following UDF was suggested in one of the old Tips & Techniques for R&R Report Writer published by Concentric Data Systems: __Declaration: Random(n_RN) __Expression: MOD(3677*RN + 3, 65536)__Because of the modulus function, the output will be a sequence of integers between 0 and 65535. __Generally it^s more convenient to have numbers between 0 and 1. This is easy to achieve. Just modify the expression to: __Expression: MOD(3677*INT(65536*RN) + 3, 65536)/65536 __To use this UDF you would create a calculated field something like: __MyRandNo: Random(MyRandNo)__Now this expression is somewhat curious! The field name itself is used as the function argument. Well, this is the whole secret of modern random number generators. Each value is based on the previous one. The three constants in the UDF^s expression are not just picked out of a hat. They must meet specific mathematical conditions. See Knuth for the full discussion. __What about the first record, before there is a value for MyRandNo? Well, R&R uses a bank value, which is treated as zero in the expression. That^s right, the first number in the sequence will always be 3 (or 3/65536 if you use the normalized form). __The first value you feed the expression is called the "seed" value. A common trick is the seed a generator with the PC^s real time clock. In R&R we could use an expression like __VAL(STUFF(RIGHT(TIME(), 5), 3, 1, "")__to get a number based on the HHMM part of the TIME() function. To use this for just the first time as the seed value: __Expression: MOD(3677*IIF(RECNO()=1, VAL(STUFF(RIGHT(TIME(), 5), 3, 1, "")), INT(65536*RN)) + 3, 65536)/65536__Note that when you are testing a report, it is often more convenient to use the unseeded expression so you get the same sequence of values every time you try your report. __Since MyRandNo is a field like any other field, you can use it in the query, or filter in SQL, to randomly select a certain percentage of your records. For example, if you wished to pick just 30 percent of your records, you could write a query/filter like: __MyRandNo less than 0.30__An interesting thing happens when you print the MyRandNo as part of the selected record. If you asked R&R to sort, or do anything that requires more than one pass over the data, the values of MyRandNo are often greater than 0.30! What^s going on? Well, in the first pass R&R evaluates MyRandNo for each record and then keeps track of which records to keep. It doesn^t buffer the entire composite records, it just keeps a list of which ones to use in the final report. Then when R&R goes through those records again, it recalculates MyRandNo. If the generator was seeded with the TIME() value, the sequence will be totally different. Even for the unseeded generator, the sequence printed will be the original one now applied to the selected records. This doesn^t mean the random selection is not working. It just means R&R assumes evaluating an expression in the same way will result in the same answer. In the case of random numbers, ____Rick Johnson